Twin-Arginine Translocation in Bacillus

CROSS-REFERENCE TO RELATED APPLICATIONS 
Pursuant to 35 U.S.C. §119(e), the present application claims benefit of
and priority to US Application No. 60/233,610, entitled "Twin-Arginine
Translocation in Bacillus", filed September 18, 2001 by Jongbloed et al.

FIELD OF THE INVENTION 
The present invention generally relates to expression of proteins in a host 
cell. The present invention provides expression vectors, methods and systems 
for the production of proteins in a host cell. 

BACKGROUND OF THE INVENTION
Eubacteria export numerous proteins across the plasma membrane into 
either the periplasmic space (Gram-negative species), or the growth medium 
(Gram-positive species). The Gram-positive eubacterium Bacillus subtilis and, in 
particular, its close relatives Bacillus amyloliquefaciens and Bacillus licheniformis 
are well known for their high capacity to secrete proteins (at gram per liter 
concentrations) into the medium. This property, which allows the efficient 
separation of (secreted) proteins from the bulk cytoplasmic protein complement, 
has led to the commercial exploitation of the latter bacilli as important "cell 
factories." Despite their high capacity to secrete proteins of Gram-positive origin, 
the secretion of recombinant proteins from Gram-negative eubacterial or 
eukaryotic origin by Bacillus species is often inefficient. This can be due to a 
variety of (potential) bottlenecks in the secretion pathway, such as poor targeting 
to the membrane, pre-translocational folding, inefficient translocation, slow or 
incorrect post-translocational folding of the secretory protein, and proteolysis. 
Notably, many of these problems relate to the specific properties of the general
secretory (Sec) pathway that was, so far, used in all documented attempts to
apply bacilli for the secretion of heterologous proteins of commercial or
biomedical value.

General strategies for the secretion of heterologous proteins by bacilli are 
based on the in-frame fusion of the respective protein with an amino-terminal 
signal peptide that directs this protein into the Sec-dependent secretion pathway. 
Upon translocation across the membrane, the signal peptide is removed by a 
signal peptidase, which is a prerequisite for the release of the translocated protein 
from the membrane, and its secretion into the medium. As exemplified with 
human interleukin-3, which is secreted by B. licheniformis at gram per liter 
concentrations, this strategy allows protein production at commercially significant 
levels. 

Two major hurdles have been identified for the secretion of heterologous
proteins via the Sec-dependent route. The first one is the translocation process 
by the Sec machinery, which is composed of a proteinaceous channel in the 
membrane (consisting of SecY, SecE, SecG and SecDF-YrbF) and a 
translocation motor (SecA). The Sec machinery is known to 'thread' its substrates 
in an unfolded state through the membrane. Consequently, this machinery is 
inherently incapable of translocating proteins that fold in the cytosol. A second 
bottleneck has been identified for other heterologous proteins that are 
translocated correctly but fold slowly or incorrectly in the cell wall environment, 
probably because this compartment lacks the appropriate chaperone molecules 
to assist in their folding. Molecular chaperones of the Hsp60 and Hsp70 classes 
are essential for the folding of many proteins, but these are all absent from 
bacterial extracytoplasmic compartments. As the membrane-cell wall 
environment of bacilli is highly proteolytic, slowly or incorrectly folding translocated 
proteins are often degraded before being secreted into the medium. 
Consequently, protein secretion via the Sec pathway is a highly efficient tool for 
the production of only a subset of heterologous proteins. 
Protein production and secretion from Bacillus species is a major
production tool with a market of over $1 billion per year. However, as noted
above, the standard export technologies, based on the well-characterized general
secretory (Sec) pathway, are frequently inapplicable for the production of proteins.
Thus, it would be beneficial to have an alternative mechanism for the production
and secretion of proteins.

SUMMARY OF THE INVENTION 
Provided herein are methods for the production of peptides in a host cell. 

In one aspect of the invention, the host cell is a gram-positive
microorganism. The gram-positive microorganism is preferably a member of the genus
Bacillus. In a more preferred embodiment the host cell is Bacillus subtilis.

In another aspect of the invention, the host cell is a gram-negative
microorganism. The gram-negative microorganism is preferably a member of the
genus Pantoea, preferably Pantoea citrea. The gram-negative microorganism
is preferably Escherichia coli.

The present invention also provides methods for increasing secretion of 
proteins from host microorganisms. In one embodiment of the present invention, the 
protein is homologous or naturally occurring in the host microorganism. In another 
embodiment of the present invention, the protein is heterologous to the host 
microorganism. Accordingly, the present invention provides a method for increasing 
secretion of a protein in a host cell using an expression vector comprising nucleic 
acid tatCd wherein said tatCd is under the control of expression signals capable of 
expressing said secretion factor in a host microorganism; introducing the expression 
vector into a host microorganism capable of expressing said protein and culturing 
said microorganism under conditions suitable for expression of said secretion factor 
and secretion of said protein. 

The present invention provides expression vectors and host cells 
comprising a nucleic acid encoding a TatCd and/or TatA. In one embodiment of 
the present invention, the host cell is genetically engineered to produce a desired 
protein, such as an enzyme, growth factor or hormone. In yet another
embodiment of the present invention, the enzyme is selected from the group
consisting of proteases, carbohydrases including amylases, cellulases, xylanases,
and lipases; isomerases such as racemases, epimerases, tautomerases, or
mutases; transferases, kinases and phosphatases; acylases, amidases, esterases,
and oxidases. In a further embodiment the expression of the secretion factor TatCd is
coordinated with the expression of other components of the secretion machinery.
Preferably other components of the secretion machinery, i.e., TatA and/or other
secretion factors identified in the future, are modulated in expression at an optimal
ratio to TatCd. For example, it may be desired to overexpress multiple secretion
factors in addition to TatCd for optimum enhancement of the secretion machinery.

The present invention also provides a method of identifying homologous 
gram-positive microorganism TatCd that comprises hybridizing part or all of B.
subtilis TatCd nucleic acid shown in Figure 1 with nucleic acid derived from gram- 
positive microorganisms. In one embodiment, the nucleic acid is of genomic 
origin. In another embodiment, the nucleic acid is a cDNA. The present 
invention encompasses novel gram-positive microorganism secretion factors 
identified by this method. 

Other objects, features and advantages of the present invention will become 
apparent from the following detailed description. It should be understood, however, 
that the detailed description and specific examples, while indicating preferred 
embodiments of the invention, are given by way of illustration only, since various 
changes and modifications within the scope and spirit of the invention will become 
apparent to one skilled in the art from this detailed description. 

BRIEF DESCRIPTION OF THE FIGURES 
Fig. 1. Tat components of B. subtilis and E. coli. The amino acid
sequences of Tat components of B. subtilis and E. coli as deduced from the
SubtiList (http://bioweb.pasteur.fr/Genolist/SubtiList.html) and Colibri
(http://bioweb.pasteur.fr/Genolist/Colibri.html) databases were used for
comparisons. Identical amino acids [*], or conservative replacements [.] are
marked. Putative transmembrane segments, indicated in gray shading, were
predicted with the TopPred2 algorithm (34, 35). (A) Comparison of TatAc (YnzA),
TatAd (YczB) and TatAy (YdiI) of B. subtilis (Bsu) with TatA, TatB and TatE of E.
coli (Eco). (B) Comparison of TatCd (YcbT) and TatCy (YdiJ) of B. subtilis with
TatC of E. coli.

Fig. 2. The tatAC regions of B. subtilis and E. coli. (A) Chromosomal
organization of the B. subtilis tatAd-tatCd and tatAy-tatCy regions (adapted from
the SubtiList database). Note that the tatAd and tatCd genes are located
downstream of the phoD gene. (B) Chromosomal organization of the E. coli
tatABCD region (adapted from the Colibri database).

Fig. 3. Construction of tatC mutant strains of B. subtilis. (A) Schematic
presentation of the construction of B. subtilis ΔtatCd and B. subtilis ΔtatCy. The
chromosomal tatCd gene was disrupted with a kanamycin resistance marker (Km^r)
by homologous recombination. To this purpose, B. subtilis 168 was transformed
with plasmid pJCd2, which cannot replicate in B. subtilis, and contains a mutant
copy of the tatCd gene with an internal BclI-AccI fragment replaced by a Km^r
marker. The chromosomal tatCy gene was disrupted with a spectinomycin
resistance marker (Sp^r) by homologous recombination. To this purpose, B.
subtilis 168 was transformed with plasmid pJCy2, which cannot replicate in B.
subtilis, and contains a mutant copy of the tatCy gene with a Sp^r marker in the
PstI site. Only restriction sites relevant for the construction are shown. tatCd', 5'
end of the tatCd gene; 'tatCd, 3' end of the tatCd gene; tatCy', 5' end of the
tatCy gene; 'tatCy, 3' end of the tatCy gene. (B) Schematic presentation of the
tatCd region of B. subtilis ItatCd. By a Campbell-type integration of the pMutin2
derivative pMICd1 into the B. subtilis 168 chromosome, the tatCd gene was
placed under the control of the IPTG-dependent Pspac promoter, which can be
repressed by the product of the lacI gene. Simultaneously, the spoVG-lacZ
reporter gene of pMutin2 was placed under the transcriptional control of the tatCd
promoter region. PCR-amplified regions are indicated with black bars. Ori
pBR322, replication functions of pBR322; Ap^r, ampicillin resistance marker; Em^r,
erythromycin resistance marker; tatCd', 3' truncated tatCd gene; T1T2,
transcriptional terminators on pMutin2. (C) Schematic presentation of the tatCy
region of B. subtilis ItatCy. By a Campbell-type integration of the pMutin2
derivative pMICy1 into the B. subtilis 168 chromosome, the tatCy gene was
placed under the control of the IPTG-dependent Pspac promoter.
Simultaneously, the spoVG-lacZ reporter gene of pMutin2 was placed under the
transcriptional control of the tatCy promoter region. tatCy', 3' truncated tatCy
gene.

Fig. 4. TatCd is required for secretion of PhoD. B. subtilis 168 (parental
strain), B. subtilis ΔtatCd, B. subtilis ΔtatCy, or B. subtilis ΔtatCd-ΔtatCy were
grown under conditions of phosphate starvation, using LPDM medium. To study
the secretion of PhoD (A) or PhoB (B), B. subtilis cells were separated from the
growth medium by centrifugation. Secreted PhoD and PhoB in the growth
medium were visualized by SDS-PAGE and Western blotting, using PhoD- or
PhoB-specific antibodies. (C) Cells of B. subtilis 168 and B. subtilis ΔtatCd-ΔtatCy
were grown under conditions of phosphate starvation, in LPDM medium. Next,
cells and growth medium were separated by centrifugation, and PhoD was
visualised by SDS-PAGE and Western blotting, using PhoD-specific antibodies.

Fig. 5. Two-dimensional gel electrophoretic analysis of the TatC-dependent
secretion of PhoD. B. subtilis 168 or B. subtilis ΔtatCd-ΔtatCy were
grown under conditions of phosphate starvation in LPDM medium. Secreted
proteins were analysed by two-dimensional gel electrophoresis as indicated in the
Experimental Procedures section. The names of proteins identified by mass
spectrometry are indicated.

Fig. 6. TatC-dependent secretion of the B. subtilis lipase LipA. B.
subtilis 168 (parental strain), B. subtilis ΔtatCd, B. subtilis ΔtatCy, or B. subtilis
ΔtatCd-ΔtatCy were grown in TY medium to end-exponential growth phase. To
study the secretion of LipA, B. subtilis cells were separated from the growth
medium by centrifugation. Proteins in the growth medium were concentrated
20-fold upon precipitation with trichloroacetic acid, and samples for polyacrylamide
gel electrophoresis (SDS-PAGE) were prepared. Secreted LipA in the growth
medium was visualized by SDS-PAGE and Western blotting, using LipA-specific
antibodies.

Fig. 7. Predicted twin-arginine (RR-)signal peptides of B. subtilis. The
listed signal peptides contain, in addition to the twin-arginines, at least one other
residue of the consensus sequence (R-R-X-#-#; printed in bold). The number of
residues in the N- and H-domains of each signal peptide, and the average 
hydrophobicity (h) of each of these domains, as determined by the algorithms of 
Kyte and Doolittle (Kyte, J., and R. F. Doolittle [1982] A simple method for 
displaying the hydropathic character of a protein. J. Mol. Biol. 157:105-32), are 
indicated. Furthermore, the RR-motifs in the N-domain, and SPase I recognition 
sites in the C-domain (i.e., positions -3 to -1 relative to the predicted SPase cleavage
site) are shown. Proteins lacking a (putative) SPase I cleavage site, some of
which contain additional transmembrane domains, are indicated with "TM". One
protein containing cell wall binding repeats is indicated with "W".
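
As an illustration of the quantities listed for Fig. 7, the following minimal Python sketch computes the average Kyte-Doolittle hydropathy of a stretch of residues and locates twin-arginine (RR) pairs. The function names and the example peptide are hypothetical and are not taken from the figure. The domain boundaries are assumed to be supplied by the user, since the sketch does not predict N-, H- or C-domains.

```python
# Kyte-Doolittle hydropathy index (Kyte and Doolittle, 1982).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def average_hydropathy(segment):
    """Mean Kyte-Doolittle value over all residues of a segment."""
    if not segment:
        raise ValueError("segment must contain at least one residue")
    return sum(KD[res] for res in segment.upper()) / len(segment)


def twin_arginine_positions(peptide):
    """Zero-based start positions of every 'RR' pair (overlaps included)."""
    seq = peptide.upper()
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "RR"]


if __name__ == "__main__":
    # Hypothetical peptide and domain split, invented for illustration only.
    example = "MSRRQFLKAAVLLLAGASA"
    n_domain, h_domain = example[:8], example[8:15]
    print("RR positions:", twin_arginine_positions(example))
    print("N-domain mean hydropathy:", round(average_hydropathy(n_domain), 2))
    print("H-domain mean hydropathy:", round(average_hydropathy(h_domain), 2))
```

A charged N-domain is expected to give a negative mean and a hydrophobic H-domain a positive one, which is the contrast the per-domain values in the figure are meant to convey.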

FIG. 8. Processing of prePhoD in E. coli TG1. (A) E. coli TG1 carrying
plasmid pAR3phoD, encoding wild type PhoD, was grown in M9 minimal medium to
early logarithmic phase. One hour prior to labelling, expression of phoD was induced
with IPTG (1 mM). Cells were labelled for 1 min with [35S]-methionine, after
which non-radioactive methionine was added. Samples were withdrawn at chase
times 10, 20, 40 and 60 min and subjected to immunoprecipitation with
monospecific antibodies against PhoD, followed by SDS-PAGE using a 10%
polyacrylamide gel and fluorography. M, molecular weight marker; Glu,
uninduced control. (B) In vivo protease mapping of PhoD in E. coli
TG1 (pAR3phoD). Cells were converted to spheroplasts and treated with
proteinase K, proteinase K and Triton X-100 or remained untreated as indicated.
Localisation of prePhoD is indicated. Accessibility of proteinase K to the cytosol
was analysed by monitoring SecB in a 15% polyacrylamide gel. PhoD and SecB
were detected by monospecific antibodies.

FIG. 9. Induction and processing of SPBla-PhoD in E. coli TG1. (A) E. coli
TG1 (pMUTIN2bla-phoD) was grown in TY medium to logarithmic growth phase.
Expression of bla-phoD was induced with IPTG (1 mM, lanes 2-4) or remained
uninduced (lane 1). At the time of induction cultures were treated with sodium
azide (3 mM, lane 3), with nigericin (1 µM, lane 4) or remained untreated (lane 2).
Samples were taken 20 min after induction of SPBla-PhoD, lysed and cell extracts
were analysed by SDS-PAGE using 10% polyacrylamide. (B and C) E. coli
TG1 (pMUTIN2bla-phoD) was grown in M9 minimal medium to early logarithmic
phase. One hour prior to labelling, expression of phoD was induced with IPTG (1 mM).
While one culture remained untreated (B), the other was treated with sodium
azide (3 mM) upon induction (C). Cells were labelled for 1 min with
[35S]-methionine, after which non-radioactive methionine was added. Samples were
withdrawn at times after chase as indicated in the figures and subjected to
immunoprecipitation with antibodies against PhoD, followed by SDS-PAGE using
a 12.5% polyacrylamide gel and fluorography. Localisation of SPBla-PhoD and
mature PhoD is indicated. [14C]-labelled molecular weight marker.

FIG. 10. Localisation of SPPhoD-LacZ in E. coli TG1 in absence or presence
of B. subtilis tatAd/Cd. E. coli TG1 strains carrying either plasmid pAR3phoD-lacZ
(A) or plasmids pAR3phoD-lacZ, pREP4 and pQE9tatAd/Cd (B) were grown in TY
medium to exponential growth and expression of phoD-lacZ and tatAd/Cd were
induced for 1 hour with arabinose (0.2%) and IPTG (1 mM), respectively.
Subcellular localisation of SPPhoD-LacZ was detected by in vivo protease mapping
according to Fig. 8B. SPPhoD-LacZ and SecB were monitored by antisera against
LacZ and SecB. Bands representing SPPhoD-LacZ, LacZ and SecB are indicated.

FIG. 11. Processing of SPPhoD-LacZ in E. coli TG1 co-expressing B. subtilis
tatAd/Cd. E. coli strains TG1 (pAR3phoD-lacZ) (A) and TG1 (pAR3phoD-lacZ,
pREP4, pQE9tatAd/Cd) (B) were grown in M9 minimal medium to early
logarithmic phase and labelled for 1 min with [35S]-methionine and subsequently
chased with non-radioactive methionine. Samples were taken at the indicated
chase times and further processed by immunoprecipitation with antiserum against
LacZ, followed by SDS-PAGE using a 7.5% polyacrylamide gel and fluorography.
Bands representing SPPhoD-LacZ and LacZ are indicated. M, [14C]-labelled
molecular weight marker.

FIG. 12. TatAd/Cd-mediated transport of SPPhoD-LacZ in E. coli is
ΔpH-dependent. E. coli TG1 (pAR3phoD-lacZ, pREP4, pQE9tatAd/Cd) was grown in
TY medium to exponential growth. Nigericin (1 µM) (A) or sodium azide (3 mM) (B)
was added to the cultures prior to induction of gene expression. Localisation of
LacZ was analysed by in vivo protease mapping as described in Fig. 10.
Samples were submitted to immunological detection of LacZ with specific
antibodies. Bands representing SPPhoD-LacZ, LacZ and SecB are indicated.

FIG. 13. Localisation of SPPhoD-LacZ in an E. coli strain depleted for tatABCDE.
E. coli strain TG1 ΔtatABCDE (pAR3phoD-lacZ, pREP4 and pQE9tatAd/Cd) was
grown in TY medium; synthesis of SPPhoD-LacZ and TatAd/Cd was induced, and cells
were subjected to in vivo protease mapping as described in Fig. 10. LacZ and SecB
were visualised by SDS-PAGE and Western blotting.

FIG. 14. Homologs of B. clausii. B. subtilis sequences were used to BLAST
search an in-house database of the B. clausii genome.

DETAILED DESCRIPTION OF THE INVENTION 
The invention will now be described in detail by way of reference only using 
the following definitions and examples. All patents and publications, including all 
sequences disclosed within such patents and publications, referred to herein are 
expressly incorporated by reference. 

Unless defined otherwise herein, all technical and scientific terms used 
herein have the same meaning as commonly understood by one of ordinary skill 
in the art to which this invention belongs. Singleton, et al., Dictionary of
Microbiology and Molecular Biology, 2d Ed., John Wiley and Sons, New York
(1994), and Hale & Marham, The Harper Collins Dictionary of Biology, Harper
Perennial, NY (1991) provide one of skill with a general dictionary of many of the 
terms used in this invention. Although any methods and materials similar or 
equivalent to those described herein can be used in the practice or testing of the 
present invention, the preferred methods and materials are described. Numeric 
ranges are inclusive of the numbers defining the range. Unless otherwise 
indicated, nucleic acids are written left to right in 5' to 3' orientation; amino acid 
sequences are written left to right in amino to carboxy orientation, respectively. 
Practitioners are particularly directed to Sambrook et al., 1989, and Ausubel FM
et al., 1993, for definitions and terms of the art. It is to be understood that this
invention is not limited to the particular methodology, protocols, and reagents 
described, as these may vary. 

The headings provided herein are not limitations of the various aspects or 
embodiments of the invention which can be had by reference to the specification 
as a whole. Accordingly, the terms defined immediately below are more fully
defined by reference to the specification as a whole. 

As used herein, the genus Bacillus includes all members known to those of 
skill in the art, including but not limited to B. subtilis, B. licheniformis, B. lentus, B. 
brevis, B. stearothermophilus, B. alkalophilus, B. amyloliquefaciens, B. clausii, B. 
coagulans, B. circulans, B. lautus and B. thuringiensis. 

The term "polypeptide" as used herein refers to a compound made up of a 
single chain of amino acid residues linked by peptide bonds. The term "protein" as 
used herein may be synonymous with the term "polypeptide" or may refer, in 
addition, to a complex of two or more polypeptides. 

The terms "chimeric polypeptide" and "fusion polypeptide" are used
interchangeably herein and refer to a signal peptide from phoD or lipA linked to the 
protein of interest or heterologous protein. 

A "signal peptide" as used herein refers to an amino-terminal extension on 
a protein to be secreted. Nearly all secreted proteins use an amino-terminal 
protein extension which plays a crucial role in the targeting to and translocation of 
precursor proteins across the membrane and which is proteolytically removed by 
a signal peptidase during or immediately following membrane transfer. 

As used herein, a "protein of interest" or "polypeptide of interest" refers to 
the protein to be expressed and secreted by the host cell. The protein of interest 
may be any protein which up until now has been considered for expression in 
prokaryotes. The protein of interest may be either homologous or heterologous to 
the host. In the first case overexpression should be read as expression above 
normal levels in said host. In the latter case basically any expression is of course 
overexpression. 

The terms "isolated" or "purified" as used herein refer to a nucleic acid or 
amino acid that is removed from at least one component with which it is naturally 
associated. 

As used herein, the term "heterologous protein" refers to a protein or 
polypeptide that does not naturally occur in a host cell. Examples of heterologous
proteins include enzymes such as hydrolases including proteases, cellulases,
amylases, other carbohydrases, and lipases; isomerases such as racemases,
epimerases, tautomerases, or mutases; transferases, kinases and phosphatases.
The heterologous gene may encode therapeutically significant proteins or 
peptides, such as growth factors, cytokines, ligands, receptors and inhibitors, as 
well as vaccines and antibodies. The gene may encode commercially important 
industrial proteins or peptides, such as proteases, carbohydrases such as 
amylases and glucoamylases, cellulases, oxidases and lipases. The gene of 
interest may be a naturally occurring gene, a mutated gene or a synthetic gene. 

The term "homologous protein" refers to a protein or polypeptide native or 
naturally occurring in a host cell. The invention includes host cells producing the 
homologous protein via recombinant DNA technology. The present invention 
encompasses a host cell having a deletion or interruption of the nucleic acid 
encoding the naturally occurring homologous protein, such as a protease, and 
having nucleic acid encoding the homologous protein re-introduced in a 
recombinant form. In another embodiment, the host cell produces the 
homologous protein. 

The term "nucleic acid molecule" includes RNA, DNA and cDNA 
molecules. It will be understood that, as a result of the degeneracy of the genetic 
code, a multitude of nucleotide sequences encoding a given protein such as TatC 
and/or TatA may be produced. The present invention contemplates every 
possible variant nucleotide sequence, encoding TatC and/or TatA, all of which are 
possible given the degeneracy of the genetic code. 
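
To give a sense of scale for the degeneracy described above, the following minimal Python sketch counts how many distinct coding sequences can encode a given amino acid sequence. It assumes the standard genetic code and ignores the stop codon. The function name and the short test peptide are hypothetical and are not sequences from this specification.

```python
from math import prod

# Number of synonymous codons per amino acid in the standard genetic code.
CODONS_PER_RESIDUE = {
    "L": 6, "S": 6, "R": 6,
    "A": 4, "G": 4, "P": 4, "T": 4, "V": 4,
    "I": 3,
    "C": 2, "D": 2, "E": 2, "F": 2, "H": 2,
    "K": 2, "N": 2, "Q": 2, "Y": 2,
    "M": 1, "W": 1,
}


def count_coding_sequences(protein):
    """Number of distinct coding sequences (stop codon excluded) for a protein."""
    return prod(CODONS_PER_RESIDUE[res] for res in protein.upper())


if __name__ == "__main__":
    # Hypothetical four-residue peptide: Met-Arg-Arg-Phe.
    print(count_coding_sequences("MRRF"))
```

For the four-residue peptide Met-Arg-Arg-Phe the count is 1 x 6 x 6 x 2 = 72, and the number grows multiplicatively with every additional residue.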

A "heterologous" nucleic acid construct or sequence has a portion of the 
sequence which is not native to the cell in which it is expressed. Heterologous, 
with respect to a control sequence, refers to a control sequence (i.e., promoter or
enhancer) that does not function in nature to regulate the same gene the 
expression of which it is currently regulating. Generally, heterologous nucleic acid 
sequences are not endogenous to the cell or part of the genome in which they are 
present, and have been added to the cell, by infection, transfection, 
microinjection, electroporation, or the like. A "heterologous" nucleic acid construct 
may contain a control sequence/DNA coding sequence combination that is the 
same as, or different from a control sequence/DNA coding sequence combination 
found in the native cell. 

As used herein, the term "vector" refers to a nucleic acid construct 
designed for transfer between different host cells. An "expression vector" refers 
to a vector that has the ability to incorporate and express heterologous DNA 
fragments in a foreign cell. Many prokaryotic and eukaryotic expression vectors 
are commercially available. Selection of appropriate expression vectors is within 
the knowledge of those having skill in the art. 

Accordingly, an "expression cassette" or "expression vector" is a nucleic 
acid construct generated recombinantly or synthetically, with a series of specified 
nucleic acid elements that permit transcription of a particular nucleic acid in a 
target cell. The recombinant expression cassette can be incorporated into a 
plasmid, chromosome, mitochondrial DNA, plastid DNA, virus, or nucleic acid 
fragment. Typically, the recombinant expression cassette portion of an 
expression vector includes, among other sequences, a nucleic acid sequence to 
be transcribed and a promoter. 

As used herein, the term "plasmid" refers to a circular double-stranded (ds) 
DNA construct used as a cloning vector, and which forms an extrachromosomal 
self-replicating genetic element in many bacteria and some eukaryotes. 

As used herein, the term "selectable marker-encoding nucleotide 
sequence" refers to a nucleotide sequence which is capable of expression in 
mammalian cells and where expression of the selectable marker confers to cells 
containing the expressed gene the ability to grow in the presence of a 
corresponding selective agent. 

As used herein, the term "promoter" refers to a nucleic acid sequence that 
functions to direct transcription of a downstream gene. The promoter will
generally be appropriate to the host cell in which the target gene is being 
expressed. The promoter together with other transcriptional and translational 
regulatory nucleic acid sequences (also termed "control sequences") are 
necessary to express a given gene. In general, the transcriptional and 
translational regulatory sequences include, but are not limited to, promoter 
sequences, ribosomal binding sites, transcriptional start and stop sequences, 
translational start and stop sequences, and enhancer or activator sequences. 

"Chimeric gene" or "heterologous nucleic acid construct", as defined herein 
refers to a non-native gene (i.e., one that has been introduced into a host) that 
may be composed of parts of different genes, including regulatory elements. A 
chimeric gene construct for transformation of a host cell is typically composed of a 
transcriptional regulatory region (promoter) operably linked to a heterologous 
protein coding sequence, or, in a selectable marker chimeric gene, to a selectable 
marker gene encoding a protein conferring antibiotic resistance to transformed 
cells. A typical chimeric gene of the present invention, for transformation into a 
host cell, includes a transcriptional regulatory region that is constitutive or 
inducible, a signal peptide coding sequence, a protein coding sequence, and a 
terminator sequence. A chimeric gene construct may also include a second DNA 
sequence encoding a signal peptide if secretion of the target protein is desired. 

A nucleic acid is "operably linked" when it is placed into a functional 
relationship with another nucleic acid sequence. For example, DNA encoding a 
secretory leader is operably linked to DNA for a polypeptide if it is expressed as a 
preprotein that participates in the secretion of the polypeptide; a promoter or 
enhancer is operably linked to a coding sequence if it affects the transcription of 
the sequence; or a ribosome binding site is operably linked to a coding sequence 
if it is positioned so as to facilitate translation. Generally, "operably linked" means 
that the DNA sequences being linked are contiguous, and, in the case of a 
secretory leader, contiguous and in reading phase. However, enhancers do not 
have to be contiguous. Linking is accomplished by ligation at convenient
restriction sites. If such sites do not exist, the synthetic oligonucleotide adaptors 
or linkers are used in accordance with conventional practice. 

As used herein, the term "gene" means the segment of DNA involved in 
producing a polypeptide chain, that may or may not include regions preceding and 
following the coding region, e.g. 5' untranslated (5' UTR) or "leader" sequences
and 3' UTR or "trailer" sequences, as well as intervening sequences (introns) 
between individual coding segments (exons). 

A nucleic acid sequence is considered to be "selectively hybridizable" to a 
reference nucleic acid sequence if the two sequences specifically hybridize to one 
another under moderate to high stringency hybridization and wash conditions. 
Hybridization conditions are based on the melting temperature (Tm) of the nucleic 
acid binding complex or probe. For example, "maximum stringency" typically 
occurs at about Tm-5°C (5° below the Tm of the probe); "high stringency" at about 
5-10° below the Tm; "intermediate stringency" at about 10-20° below the Tm of
the probe; and "low stringency" at about 20-25° below the Tm. Functionally, 
maximum stringency conditions may be used to identify sequences having strict 
identity or near-strict identity with the hybridization probe; while high stringency 
conditions are used to identify sequences having about 80% or more sequence 
identity with the probe. 
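
The temperature bands above can be expressed as a small lookup. The following Python sketch is illustrative only: the function name is hypothetical, and because the text gives approximate, touching ranges, the handling of the boundary values (for example, assigning exactly 5° below the Tm to "maximum") is an assumption made here.

```python
def classify_stringency(tm_celsius, hybridization_temp_celsius):
    """Label a hybridization temperature by how far it lies below the probe Tm."""
    delta = tm_celsius - hybridization_temp_celsius
    if delta < 0:
        return "above Tm"  # no stable hybrid expected above the Tm
    if delta <= 5:
        return "maximum"  # about Tm-5
    if delta <= 10:
        return "high"  # about 5-10 below the Tm
    if delta <= 20:
        return "intermediate"  # about 10-20 below the Tm
    if delta <= 25:
        return "low"  # about 20-25 below the Tm
    return "below low-stringency range"


if __name__ == "__main__":
    # Hypothetical probe with a Tm of 70 degrees C.
    for temp in (65, 62, 55, 47):
        print(temp, classify_stringency(70, temp))
```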

Moderate and high stringency hybridization conditions are well known in 
the art (see, for example, Sambrook, et al., 1989, Chapters 9 and 11, and in
Ausubel, F.M., et al., 1993, expressly incorporated by reference herein). An
example of high stringency conditions includes hybridization at about 42°C in 50%
formamide, 5X SSC, 5X Denhardt's solution, 0.5% SDS and 100 µg/ml denatured
carrier DNA followed by washing two times in 2X SSC and 0.5% SDS at room
temperature and two additional times in 0.1X SSC and 0.5% SDS at 42°C.

As used herein, "recombinant" includes reference to a cell or vector, that 
has been modified by the introduction of a heterologous nucleic acid sequence or 
that the cell is derived from a cell so modified. Thus, for example, recombinant
cells express genes that are not found in identical form within the native
(non-recombinant) form of the cell or express native genes that are otherwise
abnormally expressed, underexpressed or not expressed at all as a result of
deliberate human intervention. 

As used herein, the terms "transformed", "stably transformed" or 
"transgenic" with reference to a cell means the cell has a non-native 
(heterologous) nucleic acid sequence integrated into its genome or as an 
episomal plasmid that is maintained through two or more generations. 

As used herein, the term "expression" refers to the process by which a 
polypeptide is produced based on the nucleic acid sequence of a gene. The 
process includes both transcription and translation. 

The term "introduced" in the context of inserting a nucleic acid sequence 
into a cell, means "transfection", or "transformation" or "transduction" and includes 
reference to the incorporation of a nucleic acid sequence into a eukaryotic or 
prokaryotic cell where the nucleic acid sequence may be incorporated into the 
genome of the cell (for example, chromosome, plasmid, plastid, or mitochondrial 
DNA), converted into an autonomous replicon, or transiently expressed (for 
example, transfected mRNA). 

The present invention provides novel gram-positive microorganism 
secretion factors and methods that can be used in microorganisms to ameliorate 
the bottleneck to protein secretion and the production of proteins in secreted 
form, in particular when the proteins are recombinantly introduced and 
overexpressed by the host cell. The present invention provides the secretion 
factors TatC and TatA derived from Bacillus subtilis. In particular, the TatCd and 
TatCy peptides, as well as the genes encoding them, are described herein. 

The recent discovery of a ubiquitous translocation pathway, specifically 
required for proteins with a twin-arginine motif in their signal peptide, has focused 
interest on its membrane-bound components, one of which is known as TatC. 
Unlike most organisms of which the genome has been sequenced completely, the 
Gram-positive eubacterium Bacillus subtilis contains two tatC-like genes, denoted 
tatCd and tatCy. The corresponding TatCd and TatCy proteins have the potential 
to be involved in the translocation of 27 proteins with putative twin-arginine signal 
peptides of which about 6 to 14 are likely to be secreted into the growth medium. 
Using a proteomic approach, we show that PhoD of B. subtilis, a 
phosphodiesterase belonging to a novel protein family of which all known 
members are synthesized with typical twin-arginine signal peptides, is secreted 
via the twin-arginine translocation pathway. Strikingly, TatCd is of major 
importance for the secretion of PhoD, whereas TatCy is not required for this 
process. Thus, TatC appears to be a specificity determinant for protein secretion 
via the Tat pathway. Based on our observations, we hypothesize that the TatC- 
determined pathway specificity is based on specific interactions between TatC- 
like proteins and other pathway components, such as TatA, of which three 
paralogues are present in B. subtilis. 

Tat Nucleic Acid and Amino Acid Sequences 

The TatCd polynucleotide having the sequence corresponding to the amino 
acid sequence as shown in Figure 1 or 14 encodes the Bacillus subtilis secretion 
factor TatCd. The Bacillus subtilis TatCd was identified via a FASTA search of 
Bacillus subtilis translated genomic sequences using a consensus sequence of 
TatC derived from E. coli. A FASTA search of Bacillus subtilis translated genomic 
sequences with the E. coli TatC sequence alone did not identify the B. subtilis 
TatCd. The present invention provides gram-positive tatCd polynucleotides which 
may be used alone or together with other secretion factors in a gram-positive host 
cell for the purpose of increasing the secretion of desired heterologous or 
homologous proteins or polypeptides. 

The present invention encompasses tatCd polynucleotide homologs 
encoding novel gram-positive microorganism tatC whether encoded by one or 
multiple polynucleotides which have at least 80%, or at least 90% or at least 95% 
identity to B. subtilis TatCd as long as the homolog encodes a protein that is able 
to function by modulating secretion in a gram-positive microorganism. As will be 
understood by the skilled artisan, due to the degeneracy of the genetic code, a 
variety of polynucleotides, i.e., tatC polynucleotide variants, can encode the 
Bacillus subtilis secretion factor TatCd. The present invention encompasses all 
such polynucleotides. 
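
Purely as an illustration of the degeneracy referred to above, the number of distinct coding sequences for a given peptide can be estimated by multiplying the number of synonymous codons for each residue. The sketch below assumes the standard genetic code, ignores the stop codon, and uses a hypothetical function name and a made-up example peptide; it is not drawn from the figures.

```python
from math import prod

# Number of synonymous codons per amino acid in the standard genetic code.
CODONS_PER_RESIDUE = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4, "H": 2, "I": 3,
    "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6, "T": 4, "W": 1, "Y": 2, "V": 4,
}


def count_coding_variants(peptide: str) -> int:
    """Count the polynucleotides that encode the peptide (stop codon not included)."""
    return prod(CODONS_PER_RESIDUE[aa] for aa in peptide.upper())


# Made-up tripeptide Met-Arg-Arg: 1 x 6 x 6 coding sequences.
print(count_coding_variants("MRR"))  # 36
```

Because the count is a product over all residues, it grows very rapidly with peptide length, which is why the variants are described as a class rather than enumerated.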

The present invention encompasses novel tatCd polynucleotide homologs 
encoding gram-positive microorganism TatC which has at least 80%, or at least 
90% or at least 95% identity to B. subtilis TatCd as long as the homolog encodes a 
protein that has activity in secretion. 

Gram-positive polynucleotide homologs of B. subtilis tatCd may be obtained by 
standard procedures known in the art from, for example, cloned DNA (e.g., a DNA 
"library"), genomic DNA libraries, by chemical synthesis once identified, by cDNA 
cloning, or by the cloning of genomic DNA, or fragments thereof, purified from a 
desired cell. (See, for example, Sambrook et al., 1989, Molecular Cloning, A 
Laboratory Manual, 2d Ed., Cold Spring Harbor Laboratory Press, Cold Spring 
Harbor, New York; Glover, D.M. (ed.), 1985, DNA Cloning: A Practical Approach, 
IRL Press, Ltd., Oxford, U.K. Vol. I, II.) A preferred source is from genomic DNA. 
Nucleic acid sequences derived from genomic DNA may contain regulatory regions 
in addition to coding regions. Whatever the source, the isolated TatCd gene should 
be molecularly cloned into a suitable vector for propagation of the gene. 

In the molecular cloning of the gene from genomic DNA, DNA fragments are 
generated, some of which will encode the desired gene. The DNA may be cleaved 
at specific sites using various restriction enzymes. Alternatively, one may use 
DNAse in the presence of manganese to fragment the DNA, or the DNA can be 
physically sheared, as for example, by sonication. The linear DNA fragments can 
then be separated according to size by standard techniques, including but not limited 
to, agarose and polyacrylamide gel electrophoresis and column chromatography. 

Once the DNA fragments are generated, identification of the specific DNA 
fragment containing the tatCd may be accomplished in a number of ways. For 
example, a B. subtilis tatCd gene of the present invention or its specific RNA, or a 
fragment thereof, such as a probe or primer, may be isolated and labeled and 
then used in hybridization assays to detect a gram-positive tatC gene (Benton, 
W. and Davis, R., 1977, Science 196:180; Grunstein, M. and Hogness, D., 1975, 
Proc. Natl. Acad. Sci. USA 72:3961). Those DNA fragments sharing substantial 
sequence similarity to the probe will hybridize under stringent conditions. 

Accordingly, the present invention provides a method for the detection of 
gram-positive TatCd polynucleotide homologs which comprises hybridizing part or 
all of a nucleic acid sequence of B. subtilis tatCd with gram-positive 
microorganism nucleic acid of either genomic or cDNA origin. 

Also included within the scope of the present invention are gram-positive 
microorganism polynucleotide sequences that are capable of hybridizing to the 
nucleotide sequence of B. subtilis tatCd under conditions of intermediate to 
maximal stringency. Hybridization conditions are based on the melting 
temperature (Tm) of the nucleic acid binding complex, as taught in Berger and 
Kimmel (1987, Guide to Molecular Cloning Techniques, Methods in Enzymology, 
Vol 152, Academic Press, San Diego CA) incorporated herein by reference, and 
confer a defined "stringency" as explained below. 

Also included within the scope of the present invention are novel gram- 
positive microorganism tatC polynucleotide sequences that are capable of 
hybridizing to part or all of the tatC nucleotide sequence of Figure ? under 
conditions of intermediate to maximal stringency. Hybridization conditions are 
based on the melting temperature (Tm) of the nucleic acid binding complex, as 
taught in Berger and Kimmel (1987, Guide to Molecular Cloning Techniques, 
Methods in Enzymology, Vol 152, Academic Press, San Diego CA) incorporated 
herein by reference, and confer a defined "stringency" as explained below. 

"Maximum stringency" typically occurs at about Tm-5°C (5°C below the Tm 
of the probe); "high stringency" at about 5°C to 10°C below Tm; "intermediate 
stringency" at about 10°C to 20°C below Tm; and "low stringency" at about 20°C 
to 25°C below Tm. As will be understood by those of skill in the art, a maximum 
stringency hybridization can be used to identify or detect identical polynucleotide 
sequences while an intermediate or low stringency hybridization can be used to 
identify or detect polynucleotide sequence homologs. 

The term "hybridization" as used herein shall include "the process by which 
a strand of nucleic acid joins with a complementary strand through base pairing" 
(Coombs J (1994) Dictionary of Biotechnology, Stockton Press, New York NY). 

The process of amplification as carried out in polymerase chain reaction 
(PCR) technologies is described in Dieffenbach CW and GS Dveksler (1995, PCR 
Primer, a Laboratory Manual, Cold Spring Harbor Press, Plainview NY). A nucleic 
acid sequence of at least about 10 nucleotides and as many as about 60 
nucleotides from the TatCd nucleotide sequence of Figure ?, preferably about 12 
to 30 nucleotides, and more preferably about 20-25 nucleotides can be used as a 
probe or PCR primer. 
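
As a non-limiting sketch, the length ranges stated above can be expressed as a simple classifier, together with a helper that lists every contiguous window of a chosen length from a sequence. The function names, tier labels and the toy sequence are hypothetical and chosen only for illustration; no real tatCd sequence is used.

```python
from typing import List


def probe_length_tier(length: int) -> str:
    """Map an oligonucleotide length onto the ranges stated in the text."""
    if 20 <= length <= 25:
        return "more preferred"
    if 12 <= length <= 30:
        return "preferred"
    if 10 <= length <= 60:
        return "usable"
    return "outside stated range"


def candidate_probes(sequence: str, length: int) -> List[str]:
    """Return every contiguous window of the given length from the sequence."""
    return [sequence[i:i + length] for i in range(len(sequence) - length + 1)]


# Toy 24-nucleotide sequence (not a real gene sequence).
toy = "ACGTACGTACGTACGTACGTACGT"
windows = candidate_probes(toy, 20)
print(len(windows), probe_length_tier(20))
```

The narrowest range is tested first, so a 22-nucleotide oligonucleotide is reported as "more preferred" rather than merely "preferred".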

The B. subtilis tatCd polynucleotide corresponding to the amino acid 
sequence as shown in Figure 1 or 14 encodes B. subtilis TatCd. The present 
invention encompasses novel gram-positive microorganism amino acid variants of 
the amino acid sequence shown in Figure 1 or 14 that are at least 80% identical, 
at least 90% identical or at least 95% identical to the sequence shown in Figure 
1 or 14 as long as the amino acid sequence variant is able to function by 
modulating secretion of proteins in gram-positive microorganisms. 
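
For illustration only, percent identity between two already-aligned amino acid sequences might be computed as sketched below and compared with the 80%, 90% and 95% thresholds recited above. The identity definition used here (matching residues divided by aligned columns, skipping columns where both sequences have a gap) is an assumption, since the text does not fix one, and the sequences are made-up examples.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns; '-' denotes a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue  # column carries no information
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def identity_tier(pct: float) -> str:
    """Report the highest recited threshold that the identity meets."""
    for threshold in (95, 90, 80):
        if pct >= threshold:
            return f"at least {threshold}%"
    return "below 80%"


# Made-up ten-residue sequences differing at one position.
pct = percent_identity("MKRLAVVGAT", "MKRLAVVGAS")
print(pct, identity_tier(pct))
```

A different convention (for example, dividing by the length of the shorter sequence) would give different values for gapped alignments, so the convention should be stated whenever a threshold is applied.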

The secretion factor TatCd as shown in Figure 1 was subjected to a 
FASTA (Lipman-Pearson routine) amino acid search against a consensus amino 
acid sequence for TatCd. The amino acid alignment is shown in Figure 1. 

Expression Systems 

The present invention provides expression systems for the enhanced 
production and secretion of desired heterologous or homologous proteins in a 
host microorganism. 

I. Coding Sequences 

In the present invention, the vector comprises at least one copy of nucleic 
acid encoding a gram-positive microorganism TatC and/or TatA secretion factor 
and preferably comprises multiple copies. In a preferred embodiment, the gram- 
positive microorganism is Bacillus. In another preferred embodiment, the gram- 
positive microorganism is Bacillus subtilis. In a preferred embodiment, 
polynucleotides which encode B. subtilis TatC and/or TatA, or fragments thereof, 
or fusion proteins or polynucleotide homolog sequences that encode amino acid 
variants of TatC and/or TatA, may be used to generate recombinant DNA 
molecules that direct the expression of TatC and/or TatA, or amino acid variants 
thereof, respectively, in gram-positive host cells. In a preferred embodiment, the 
host cell belongs to the genus Bacillus. In another preferred embodiment, the 
host cell is B. subtilis. 

As will be understood by those of skill in the art, it may be advantageous to 
produce polynucleotide sequences possessing non-naturally occurring codons. 
Codons preferred by a particular gram-positive host cell (Murray E et al (1989) 
Nuc Acids Res 17:477-508) can be selected, for example, to increase the rate of 
expression or to produce recombinant RNA transcripts having desirable 
properties, such as a longer half-life, than transcripts produced from naturally 
occurring sequence. 

Altered gram positive tatC and/or tatA polynucleotide sequences which 
may be used in accordance with the invention include deletions, insertions or 
substitutions of different nucleotide residues resulting in a polynucleotide that 
encodes the same or a functionally equivalent TatC and/or TatA homolog, 
respectively. As used herein a "deletion" is defined as a change in either 
nucleotide or amino acid sequence in which one or more nucleotides or amino 
acid residues, respectively, are absent. 

As used herein an "insertion" or "addition" is that change in a nucleotide or 
amino acid sequence which has resulted in the addition of one or more 
nucleotides or amino acid residues, respectively, as compared to the naturally 
occurring gram positive TatC and/or TatA. 

As used herein "substitution" results from the replacement of one or more 
nucleotides or amino acids by different nucleotides or amino acids, respectively. 

The encoded protein may also show deletions, insertions or substitutions 
of amino acid residues which produce a silent change and result in a functionally 
equivalent gram-positive TatC and/or TatA variant. Deliberate amino acid 
substitutions may be made on the basis of similarity in polarity, charge, solubility, 
hydrophobicity, hydrophilicity, and/or the amphipathic nature of the residues as 
long as the variant retains the ability to modulate secretion. For example, 
negatively charged amino acids include aspartic acid and glutamic acid; positively 
charged amino acids include lysine and arginine; and amino acids with uncharged 
polar head groups having similar hydrophilicity values include leucine, isoleucine, 
valine; glycine, alanine; asparagine, glutamine; serine, threonine, phenylalanine, 
and tyrosine. 

The TatC and/or TatA polynucleotides of the present invention may be 
engineered in order to modify the cloning, processing and/or expression of the 
gene product. For example, mutations may be introduced using techniques which 
are well known in the art, e.g., site-directed mutagenesis, to insert new restriction 
sites, to alter glycosylation patterns or to change codon preference. 

In one embodiment of the present invention, a TatC and/or TatA 
polynucleotide may be ligated to a heterologous sequence to encode a fusion 
protein. A fusion protein may also be engineered to contain a cleavage site 
located between the TatC and/or TatA nucleotide sequence and the heterologous 
protein sequence, so that the TatC and/or TatA protein may be cleaved and 
purified away from the heterologous moiety. 

II. Vector Sequences 

Expression vectors used in expressing the secretion factors of the present 
invention in gram-positive microorganisms comprise at least one promoter 
associated with a gram-positive tatC and/or tatA, which promoter is functional in 
the host cell. In one embodiment of the present invention, the promoter is the 
wild-type promoter for the selected secretion factor and in another embodiment of 
the present invention, the promoter is heterologous to the secretion factor, but still 
functional in the host cell. 

Additional promoters associated with heterologous nucleic acid encoding 
desired proteins or polypeptides may be introduced via recombinant DNA 
techniques. In one embodiment of the present invention, the host cell is capable 
of overexpressing a heterologous protein or polypeptide and nucleic acid 
encoding one or more secretion factor(s) is(are) recombinantly introduced. In one 
preferred embodiment of the present invention, nucleic acid encoding TatC and/or 
TatA is stably integrated into the microorganism genome. In another 
embodiment, the host cell is engineered to overexpress a secretion factor of the 
present invention and nucleic acid encoding the heterologous protein or 
polypeptide is introduced via recombinant DNA techniques. The present 
invention encompasses gram-positive host cells that are capable of 
overexpressing other secretion factors known to those of skill in the art or 
identified in the future. 

In a preferred embodiment, the expression vector contains a multiple 
cloning site cassette which preferably comprises at least one restriction 
endonuclease site unique to the vector, to facilitate ease of nucleic acid 
manipulation. In a preferred embodiment, the vector also comprises one or more 
selectable markers. As used herein, the term selectable marker refers to a gene 
capable of expression in the gram-positive host which allows for ease of selection 
of those hosts containing the vector. Examples of such selectable markers 
include, but are not limited to, genes conferring resistance to antibiotics such as 
erythromycin, actinomycin, chloramphenicol and tetracycline. 

III. Transformation 

In one embodiment of the present invention, nucleic acid encoding one or 
more gram-positive secretion factor(s) of the present invention is introduced into a 
gram-positive host cell via an expression vector capable of replicating within the 
host cell. Suitable replicating plasmids for Bacillus are described in Molecular 
Biological Methods for Bacillus, Ed. Harwood and Cutting, John Wiley & Sons, 
1990, hereby expressly incorporated by reference; see chapter 3 on plasmids. 
Suitable replicating plasmids for B. subtilis are listed on page 92. 

In another embodiment, nucleic acid encoding a gram-positive 
microorganism tatC and/or tatA is stably integrated into the microorganism genome. 
Preferred gram-positive host cells are from the genus Bacillus. Another preferred 
gram-positive host cell is B. subtilis. Several strategies have been described in 
the literature for the direct cloning of DNA in Bacillus. Plasmid marker rescue 
transformation involves the uptake of a donor plasmid by competent cells carrying 
a partially homologous resident plasmid (Contente et al., Plasmid 2:555-571 
(1979); Haima et al., Mol. Gen. Genet. 223:185-191 (1990); Weinrauch et al., J. 
Bacteriol. 154(3):1077-1087 (1983); and Weinrauch et al., J. Bacteriol. 
169(3):1205-1211 (1987)). The incoming donor plasmid recombines with the 
homologous region of the resident "helper" plasmid in a process that mimics 
chromosomal transformation. 

Transformation by protoplast transformation is described for B. subtilis in 
Chang and Cohen, (1979) Mol. Gen. Genet. 168:111-115; for B. megaterium in 
Vorobjeva et al., (1980) FEMS Microbiol. Letters 7:261-263; for B. 
amyloliquefaciens in Smith et al., (1986) Appl. and Env. Microbiol. 51:634; for B. 
thuringiensis in Fisher et al., (1981) Arch. Microbiol. 139:213-217; for B. 
sphaericus in McDonald (1984) J. Gen. Microbiol. 130:203; and B. larvae in 
Bakhiet et al., (1985) 49:577. Mann et al., (1986, Current Microbiol. 13:131-135) 
report on transformation of Bacillus protoplasts and Holubova (1985, Folia 
Microbiol. 30:97) discloses methods for introducing DNA into protoplasts using 
DNA-containing liposomes. 

Identification of Transformants 

Although the presence/absence of marker gene expression suggests that 
the gene of interest is also present, its presence and expression should be 
confirmed. For example, if the nucleic acid encoding tatC and/or tatA is inserted 
within a marker gene sequence, recombinant cells containing the insert can be 
identified by the absence of marker gene function. Alternatively, a marker gene 
can be placed in tandem with nucleic acid encoding the secretion factor under the 
control of a single promoter. Expression of the marker gene in response to 
induction or selection usually indicates expression of the secretion factor as well. 

Alternatively, host cells which contain the coding sequence for a secretion 
factor and express the protein may be identified by a variety of procedures 
known to those of skill in the art. These procedures include, but are not limited to, 
DNA-DNA or DNA-RNA hybridization and protein bioassay or immunoassay 
techniques which include membrane-based, solution-based, or chip-based 
technologies for the detection and/or quantification of the nucleic acid or protein. 

The presence of the tatC and/or tatA polynucleotide sequence can be 
detected by DNA-DNA or DNA-RNA hybridization or amplification using probes, 
portions or fragments derived from the B. subtilis tatC and/or tatA polynucleotide. 

Secretion Assays 

Means for determining the levels of secretion of a heterologous or 
homologous protein in a gram-positive host cell and detecting secreted proteins 
include the use of either polyclonal or monoclonal antibodies specific for the protein. 
Examples include enzyme-linked immunosorbent assay (ELISA), 
radioimmunoassay (RIA) and fluorescence-activated cell sorting (FACS). These 
and other assays are described, among other places, in Hampton R et al (1990, 
Serological Methods, a Laboratory Manual, APS Press, St Paul MN) and Maddox 
DE et al (1983, J Exp Med 158:1211). 

A wide variety of labels and conjugation techniques are known by those 
skilled in the art and can be used in various nucleic and amino acid assays. 
Means for producing labeled hybridization or PCR probes for detecting specific 
polynucleotide sequences include oligolabeling, nick translation, end-labeling or 
PCR amplification using a labeled nucleotide. Alternatively, the nucleotide 
sequence, or any portion of it, may be cloned into a vector for the production of an 
mRNA probe. Such vectors are known in the art, are commercially available, and 
may be used to synthesize RNA probes in vitro by addition of an appropriate RNA 
polymerase such as T7, T3 or SP6 and labeled nucleotides. 

A number of companies such as Pharmacia Biotech (Piscataway NJ), 
Promega (Madison WI), and US Biochemical Corp (Cleveland OH) supply 
commercial kits and protocols for these procedures. Suitable reporter molecules 
or labels include those radionuclides, enzymes, fluorescent, chemiluminescent, or 
chromogenic agents as well as substrates, cofactors, inhibitors, magnetic 
particles and the like. Patents teaching the use of such labels include US Patents 
3,817,837; 3,850,752; 3,939,350; 3,996,345; 4,277,437; 4,275,149 and 
4,366,241. Also, recombinant immunoglobulins may be produced as shown in 
US Patent No. 4,816,567 and incorporated herein by reference. 

Purification of Proteins 

Host cells transformed with polynucleotide sequences encoding 
heterologous or homologous protein may be cultured under conditions suitable for 
the expression and recovery of the encoded protein from cell culture. The protein 
produced by a recombinant host cell comprising a secretion factor of the present 
invention will be secreted into the culture media. Other recombinant 
constructions may join the heterologous or homologous polynucleotide sequences 
to nucleotide sequence encoding a polypeptide domain which will facilitate 
purification of soluble proteins (Kroll DJ et al (1993) DNA Cell Biol 12:441-53). 

Such purification facilitating domains include, but are not limited to, metal 
chelating peptides such as histidine-tryptophan modules that allow purification on 
immobilized metals (Porath J (1992) Protein Expr Purif 3:263-281), protein A 
domains that allow purification on immobilized immunoglobulin, and the domain 
utilized in the FLAGS extension/affinity purification system (Immunex Corp, 
Seattle WA). The inclusion of a cleavable linker sequence such as Factor Xa or 
enterokinase (Invitrogen, San Diego CA) between the purification domain and the 
heterologous protein can be used to facilitate purification. 

In the present studies, we demonstrate for the first time that a functional 
Tat pathway, required for secretion of the PhoD protein, exists in the Gram- 
positive eubacterium B. subtilis. The TatCd protein, specified by one of the two 
tatC genes of B. subtilis, plays a critical role in this secretion pathway. In contrast, 
the TatCy protein appears to be of minor importance for PhoD secretion. Even 
though no particular function for TatCy was identified, our results show that the 
corresponding gene is transcribed under conditions of phosphate starvation when 
TatCd fulfils its critical role in PhoD secretion. Furthermore, as inferred from the 
fact that low levels of PhoD secretion by B. subtilis ΔtatCd (but never by 
tatCd-tatCy double mutants) were observed in some experiments, TatCy seems to be 
actively involved in RR-pre-protein translocation. Notably, these observations 
imply that TatC is a specificity determinant for protein secretion via the Tat 
pathway. In fact, our observation that the secretion of PhoD was increased in the 
absence of TatCy suggests that abortive interactions between pre-PhoD and 
TatCy or TatCy-containing translocases can occur. Nevertheless, alternative, 
more indirect explanations for this observation can presently not be excluded. 
Interestingly, the positive effect of the tatCy mutation on PhoD secretion is 
reminiscent of the effect that was observed when certain genes (i.e. sipS and/or 
sipU) for paralogous type I signal peptidases of B. subtilis were disrupted. This 
resulted in significantly improved rates of processing of the α-amylase AmyQ 
precursor by the remaining type I signal peptidases (i.e. SipT, SipV and/or SipW; 
Tjalsma et al. (1998) Genes Dev. 12, 2318-2331, Tjalsma et al. (1997) J. Biol. 
Chem. 272, 25983-25992, and Bolhuis et al. (1996) Mol. Microbiol. 22, 605-618). 
Taken together, these observations suggest that, in general, the presence of two 
or more paralogous secretion machinery components in B. subtilis may result in, 
as yet undefined, abortive interactions with certain secretory pre-proteins. 

The PhoD protein of B. subtilis is synthesized with a typical RR-signal 
peptide that contains a long hydrophilic N-region with a consensus RR-motif, and 
a mildly hydrophobic H-region (Table I). In fact, the RR-signal peptide of PhoD 
contains no detectably atypical features for RR-signal peptides (see: Berks, B. C. 
(1996) Mol. Microbiol. 22, 393-404) and, therefore, it is presently not clear why 
PhoD specifically requires the presence of TatCd for efficient secretion. 
Strikingly, the secretion of YdhF, the only other protein with a predicted RR-signal 
peptide that could, so far, be identified through 2D-gel electrophoresis, was not 
affected in the ΔtatCd-ΔtatCy mutant. This observation shows that the RR-motif 
in the YdhF signal peptide does not direct this protein into the Tat pathway. 
Instead, YdhF is, most likely, secreted via the Sec pathway, which could be due to 
the relatively short, but highly hydrophobic, H-region of the YdhF signal peptide. 
Similarly, the WapA and WprA proteins of B. subtilis, which have predicted RR- 
signal peptides (Table I), were recently shown to be secreted in a strongly Ffh- 
and SecA-dependent manner (Hirose et al. (2000) Microbiology 146, 65-75), 
which implies that these proteins do not use the Tat pathway. Even though the H- 
regions of these signal peptides are of similar size as that of the PhoD signal 
peptide, they are significantly more hydrophobic. The latter observation suggests 
that, like in E. coli (Cristobal et al. (1999) EMBO J. 18, 2982-2990), the 
hydrophobicity of the H-region is an important determinant that allows the cell to 
discriminate between Sec-type and RR-signal peptides. Notably, the predicted 
RR-motifs of WapA, WprA and YdhF are also different from previously described 
RR-signal peptides, because they contain Lys or Ser residues at the +3 position 
relative to the twin-arginines (Table I). In fact, hydrophilic residues are completely 
absent from the +2 and +3 positions, relative to the twin-arginines of known 
RR-signal peptides (Berks, B. C. (1996) Mol. Microbiol. 22, 393-404, Brink et al. 
(1998) FEBS Lett. 434, 425-430, Sargent et al. (1998) EMBO J. 17, 3640-3650, 
Chaddock et al. (1995) EMBO J. 14, 2715-2722, Sargent et al. (1999) J. Biol. 
Chem. 274, 36073-36082, and Santini et al., (1998) EMBO J. 17, 101-112). If low 
overall hydrophobicity and the presence of hydrophobic residues at the +2 and +3 
positions are used as criteria for the prediction of RR-signal peptides, the total 
number of predicted B. subtilis signal peptides of this type can be reduced from 
27 to 11. Notably, of these 11 pre-proteins, 4 contain additional transmembrane 
segments, and 1 lacks a signal peptidase cleavage site. Thus, based on these 
more stringent criteria, one would predict that merely 6 proteins of B. subtilis (i.e. 
AlbB, LipA, PhoD, YkpC, YkuE, and YwbN) are secreted into the growth medium 
via the Tat pathway. This would explain why the secretion of only one protein, 
PhoD, was detectably affected in B. subtilis ΔtatCd-ΔtatCy under conditions of 
phosphate starvation. In this respect, it is important to note that TatC-dependent 
secretion of some other proteins with (predicted) RR-signal peptides may have 
remained unnoticed in the present studies, because they are expressed at very 
low levels under conditions of phosphate starvation. Furthermore, it is 
conceivable that other TatC-dependent proteins were missed in the 2D-gel 
electrophoretic analysis, due to their poor separation in the first dimension. 
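
The more stringent prediction criteria discussed above (a twin-arginine pair, hydrophobic residues at the +2 and +3 positions, and an H-region of low overall hydrophobicity) can be sketched roughly as follows. This is an illustrative sketch only and not the procedure used in the present studies: the use of the Kyte-Doolittle hydropathy scale, the rule that a positive value counts as "hydrophobic", the cut-off of 2.0 for the mean H-region hydropathy, the convention that +1 is the residue immediately after the second arginine, the function names, and the example peptides are all assumptions introduced here. The H-region is supplied by the caller rather than predicted.

```python
# Kyte-Doolittle hydropathy values (positive values are hydrophobic).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5, "E": -3.5,
    "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8,
    "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def rr_downstream_hydrophobic(signal_peptide: str) -> bool:
    """True if an RR pair exists and the +2 and +3 residues after it are hydrophobic."""
    sp = signal_peptide.upper()
    i = sp.find("RR")  # index of the first arginine of the first RR pair
    if i < 0 or i + 4 >= len(sp):
        return False
    # The second Arg is at i + 1, so +2 is at i + 3 and +3 is at i + 4.
    return KD[sp[i + 3]] > 0 and KD[sp[i + 4]] > 0


def mean_hydropathy(region: str) -> float:
    """Average Kyte-Doolittle value over a region such as the H-region."""
    return sum(KD[aa] for aa in region.upper()) / len(region)


def passes_stringent_criteria(signal_peptide: str, h_region: str,
                              max_h_mean: float = 2.0) -> bool:
    """Apply both criteria; max_h_mean is a hypothetical cut-off."""
    return (rr_downstream_hydrophobic(signal_peptide)
            and mean_hydropathy(h_region) <= max_h_mean)


# Made-up peptides, not sequences from Table I.
print(passes_stringent_criteria("MSTRRQFLKGAAAGA", "GAAAGA"))
print(rr_downstream_hydrophobic("MKRRKLSLLLLLIVL"))
```

In the second made-up peptide a Ser occupies the +3 position, so it is rejected in the same way that WapA, WprA and YdhF are set apart in the text. The count given above follows the same kind of filtering: of the 11 candidates, 4 with additional transmembrane segments and 1 lacking a signal peptidase cleavage site are removed, leaving 6.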

Interestingly, the YdhF protein was also predicted to be a lipoprotein (Table 
I; Tjalsma et al. (1999) J. Biol. Chem. 274, 1698-1707). The fact that YdhF was 
found in the growth medium either suggests that this prediction was wrong, or that 
YdhF is released into the growth medium via a secondary processing event that 
follows cleavage by the lipoprotein-specific (type II) signal peptidase (Pragai et al. 
(1997) Microbiology 143, 1327-1333). Such secondary processing events have 
been described previously for other Bacillus lipoproteins (see: Tjalsma et al. 
(1999) J. Biol. Chem. 274, 1698-1707). In fact, the latter possibility most likely 
explains why the phosphate-binding protein PstS, which is a typical lipoprotein 
(previously known as YqgG; Tjalsma et al. (1999) J. Biol. Chem. 274, 1698-1707, 
and Qi, Y., and Hulett, F. M. (1998) J. Bacteriol. 180, 4007-4010), was found in 
the growth medium. As expected for lipoproteins, significant amounts of PstS 
were also present in a cell-associated form (Antelmann, H., Scharf, C., and 
Hecker, M. (2000) J. Bacteriol., in press, and Eymann et al. (1996) Microbiology 
142, 3163-3170). 

One of the outstanding features of the Tat pathway of E. coli is its ability to 
translocate fully-folded proteins that bind cofactors prior to export from the 
cytoplasm, and even multimeric enzyme complexes (Berks, B. C. (1996) Mol. 
Microbiol. 22, 393-404, Weiner et al. (1998) Cell 93, 93-101, Santini et al. (1998) 
EMBO J. 17, 101-112, and Rodrigue et al. (1999) J. Biol. Chem. 274, 
13223-13228). Similarly, the thylakoidal Tat pathway has been shown to translocate 
folded proteins (Bogsch et al. (1997) EMBO J. 16, 3851-3859, and Hynds et al. 
(1998) J. Biol. Chem. 273, 34868-34874). Thus, it seems as if this pathway is 
used for the transport of proteins that are Sec-incompatible, either because they 
must fold before translocation, or because they fold too rapidly or tightly to allow 
transport via the Sec-system, which is known to transport proteins in an unfolded 
conformation (see: Dalbey, R. E., and Robinson, C. (1999) Trends Biochem. Sci. 
24, 17-22). Consistent with this idea, folded pre-proteins, some of which were 
biologically active, were shown to accumulate in tat mutants of E. coli (Sargent et 
al. (1998) EMBO J. 17, 3640-3650, Bogsch et al. (1998) J. Biol. Chem. 273, 
18003-18006, Weiner et al. (1998) Cell 93, 93-101, and Sargent et al. (1999) J. 
Biol. Chem. 274, 36073-36082). Therefore, it is conceivable that the Tat pathway 
of B. subtilis is also involved in the transport of folded cofactor-binding proteins. 
This view is supported by the observation that the iron-sulfur cluster-binding 
Rieske protein QcrA of B. subtilis (Yu et al. (1995) J. Bacteriol. 177, 6751-6760) is 
synthesised with a predicted RR-signal peptide (Table I). Nevertheless, 
compared to the parental strain, pre-PhoD accumulation was not increased in B. 
subtilis ΔtatCd-ΔtatCy. This suggests that pre-PhoD is either not folded prior to 
translocation, or that folded pre-PhoD is sensitive to cytosolic proteases of B. 
subtilis. We favor the first possibility, because most native B. subtilis proteins are 
highly resistant to proteolysis, provided that they are properly folded (see: 
Stephenson et al. (1998) Appl. Environ. Microbiol. 64, 2875-2881, Bolhuis et al. 
(1999) J. Biol. Chem. 274, 15865-15868, and Bolhuis et al. (1999) Appl. Environ. 
Microbiol. 65, 2934-2941). Consistent with the idea that pre-PhoD could be 
secreted in a loosely folded or unfolded conformation is the observation that 
loosely folded proteins can be transported via the thylakoidal Tat pathway 
(Bogsch et al. (1997) EMBO J. 16, 3851-3859, and Hynds et al. (1998) J. Biol. 
Chem. 273, 34868-34874). Strikingly, the four known homologues of PhoD, all of 
which were identified in Streptomyces species, are synthesised with a typical RR-
signal peptide (Table IV). Thus it seems that PhoD-like proteins belong to a novel 
family of proteins with an as yet undefined requirement for translocation via the 
Tat pathway. In this respect, it is interesting to note that the N-regions of the RR-
signal peptides of PhoD and PhoD-like proteins are among the longest N-regions 
of known RR-signal peptides (see: Berks, B. C. (1996) Mol. Microbiol. 22, 393-
404). 
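
The comparison of N-region lengths can be illustrated with a short script. The following Python sketch is an illustration only, under stated assumptions: it uses the twin-arginine consensus (S/T)-R-R-x-F-L-K described by Berks (1996), it takes the N-region to be all residues preceding that motif, and it runs on a made-up test sequence rather than any sequence from this disclosure. The function name and the 60-residue search window are hypothetical.

```python
import re

# Assumed consensus for RR-signal peptides: (S/T)-R-R-x-F-L-K (Berks, 1996).
# Real signal peptides often deviate from it; this pattern is deliberately strict.
RR_CONSENSUS = re.compile(r"[ST]RR.FLK")


def n_region_length(protein, window=60):
    """Return the number of residues that precede the RR-motif, or None.

    Only the first `window` residues are searched (a hypothetical cut-off),
    because signal peptides are located at the amino-terminus.
    """
    match = RR_CONSENSUS.search(protein.upper()[:window])
    return match.start() if match else None


# Hypothetical sequence, not taken from the disclosure.
example = "MKTSRRSFLKAAALGG"
print(n_region_length(example))
```

A relaxed pattern could be substituted for the strict consensus where deviating RR-motifs need to be detected.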

Finally, one of the most striking results of our present studies is the 
observation that TatC is a specificity determinant for protein secretion via the Tat 
pathway of B. subtilis. Interestingly, this finding questions to some extent the 
hypothesis that the TatA-like components of this pathway have a receptor-like 
function (Chanal et al. (1998) Mol. Microbiol. 30, 674-676, and Settles et al. 
(1997) Science 278, 1467-1470). Instead, it suggests that TatC-like proteins 
recognise specific elements of certain exported proteins, such as the RR-signal 
peptide. Thus, our results might represent the first experimental support for the 
'sea anemone' model of Berks et al. (Mol. Microbiol. (2000) 5, 260-274) in which, 
on the basis of theoretical considerations, it is proposed that the TatABE proteins 
form a protein-conducting channel, while the TatC protein acts as an RR-signal 
peptide receptor. Alternatively, it is still conceivable that certain proteins with RR- 
signal peptides are recognized by TatA-like proteins, provided that a specific 
TatC-like partner protein is present. A third possibility would be that specific TatA- 
and TatC-like partner proteins are jointly involved in substrate recognition. The 
fact that neither TatAc nor TatAd of B. subtilis was able to complement tatA, tatB 
or tatE mutations in E. coli, and that TatCd of B. subtilis was unable to 
complement the E. coli tatC mutation (our unpublished observations), suggests 
that the TatC-determined pathway specificity, as described in the present studies, 
is based on specific interactions between TatA- and TatC-like proteins. If so, this 
implies that B. subtilis contains two parallel routes for twin-arginine translocation, 
one of which involves the TatCd protein. As shown in the present studies, the 
TatCd-dependent translocation appears to be activated specifically under 
conditions of phosphate starvation, perhaps with the sole purpose of translocating 
PhoD. Similar to the situation in B. subtilis, parallel routes for twin-arginine 
translocation may be present in other organisms, such as Archaeoglobus 
fulgidus, which was shown to contain two paralogous tatC-like genes (Berks et al. 
(2000) Mol. Microbiol. 5, 260-274, and Klenk et al. (1997) Nature 390, 364-370). 

Additional work carried out in support of the present invention indicates that 
both TatCd and TatCy may be Tat components that are responsible for the secretion 
of other proteins as well. In fact, with reference to Figure 6, a tatCd deletion totally 
abolishes the secretion of LipA. Figure 6 also suggests, however, that while TatCd is 
the primary Tat component, TatCy plays some role in the secretion of LipA (although 
a less stringent one than TatCd). 

The bacterial twin-arginine translocation (Tat) pathway has recently been 
described for PhoD of Bacillus subtilis, a phosphodiesterase containing a twin-
arginine signal peptide. The expression of phoD, induced in response to 
phosphate depletion, is co-regulated with expression of the tatAd and tatCd genes 
localized downstream of phoD. While tatCd was of major importance for the 
secretion of PhoD, the second copy of tatC (tatCy) was not required for this 
process. To characterise the specificity of PhoD transport further, translocation of 
PhoD was investigated in E. coli. Using gene fusions, we analysed the particular 
roles of the signal peptide and the mature region of PhoD in canalising the 
transport route. A hybrid protein consisting of the signal peptide of TEM-β-
lactamase and mature PhoD was transported in a Sec-dependent manner, indicating 
that the mature part of PhoD does not contain information canalising the selected 
translocation route. PrePhoD, as well as a fusion protein consisting of the signal 
peptide of PhoD (SPPhoD) and β-galactosidase (LacZ), remained cytosolic in 
Escherichia coli. Thus, SPPhoD appears not to be recognised by the E. coli transport 
systems. Co-expression of the B. subtilis tatAd/Cd genes resulted in the processing of 
SPPhoD-LacZ and periplasmic localisation of LacZ, illustrating a close substrate-Tat 
component specificity of the PhoD-TatAd/Cd transport system. While blockage of 
Sec-dependent transport did not affect the localisation of SPPhoD-LacZ, 
translocation and processing were dependent on the pH gradient across the cytosolic 
membrane. TatAd/Cd-mediated transport of SPPhoD-LacZ was observed in the 
absence of the E. coli Tat proteins, indicating that the SPPhoD peptide and its adopted 
TatAd/Cd protein pair form an autonomous Tat system in E. coli. Thus, the 
minimal requirement for an active Tat-dependent protein translocation system 
consists of a Tat substrate containing a twin-arginine signal peptide, its specific 
TatA/C proteins and the pH gradient across the cytosolic membrane. 

The following preparations and examples are given to enable those skilled 
in the art to more clearly understand and practice the present invention. They 
should not be considered as limiting the scope and/or spirit of the invention, but 
merely as being illustrative and representative thereof. 

In the experimental disclosure which follows, the following abbreviations 
apply: eq (equivalents); M (Molar); µM (micromolar); N (Normal); mol (moles); 
mmol (millimoles); µmol (micromoles); nmol (nanomoles); g (grams); mg 
(milligrams); kg (kilograms); µg (micrograms); L (liters); ml (milliliters); µl 
(microliters); cm (centimeters); mm (millimeters); µm (micrometers); nm 
(nanometers); °C (degrees Centigrade); h (hours); min (minutes); sec (seconds); 
msec (milliseconds); TLC (thin layer chromatography); TY, tryptone/yeast extract; 
Ap, ampicillin; DTT, dithiothreitol; Em, erythromycin; HPDM, high phosphate 
defined medium; IPG, immobilized pH gradient; IPTG, isopropyl-β-D-
thiogalactopyranoside; Km, kanamycin; LPDM, low phosphate defined medium; 
MM, minimal medium; OD, optical density; PAGE, polyacrylamide gel 
electrophoresis; PCR, polymerase chain reaction; Sp, spectinomycin; SSM, 
Schaeffer's sporulation medium; 2D, two-dimensional. 

Example 1 

Identification of tat genes of B. subtilis 
In order to investigate whether B. subtilis contains a potential Tat pathway, 
a search for homologues of E. coli Tat proteins was performed, using the 
complete sequence of the B. subtilis genome (Kunst et al. (1997) Nature 390, 
249-256). First, sequence comparisons revealed that B. subtilis contains three 
paralogous genes (i.e. yczB, ydiI and ynzA) that specify proteins with sequence 
similarity to the three paralogous E. coli TatA, TatB and TatE proteins. 
Specifically, the YdiI protein (57 residues), which was renamed TatAy, showed 
the highest degree of sequence similarity with the E. coli TatA protein (58% 
identical residues and conservative replacements); the YczB protein (70 
residues), which was renamed TatAd, showed the highest degree of sequence 
similarity with the E. coli TatB protein (54% identical residues and conservative 
replacements); and the YnzA protein (62 residues), which was renamed TatAc, 
showed the highest degree of sequence similarity with the E. coli TatB protein 
(53% identical residues and conservative replacements). All three B. subtilis 
proteins were renamed TatA to avoid possible misinterpretations with respect to 
their respective functions, which are presently unknown. Like TatA, TatB, and 
TatE of E. coli, the three TatA proteins of B. subtilis appear to have one amino-
terminal membrane spanning domain (Fig. 1A), and the carboxyl-terminal parts of 
these proteins are predicted to face the cytoplasm. Even though TatAc, TatAd 
and TatAy of B. subtilis show significant similarity to TatA, TatB and TatE of E. 
coli when the amino acid sequences of these proteins are compared pairwise, 
only a limited number of residues is conserved in all six amino acid sequences 
(17% identical residues and conservative replacements; Fig. 1A). 
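
The percentages of "identical residues and conservative replacements" quoted above can be computed from an alignment along the following lines. This Python sketch is an illustration under stated assumptions, not the procedure used for Fig. 1: the grouping of amino acids into conservative classes is one common choice, gap-containing columns are counted as non-conserved, the percentage is taken over the full alignment length, and the function names are hypothetical.

```python
# Conservative-replacement classes: an assumption for illustration only.
CONSERVATIVE_GROUPS = [
    set("ILVM"), set("FYW"), set("KRH"), set("DE"),
    set("NQ"), set("ST"), set("AG"),
]


def column_is_similar(column):
    """True if all residues in an alignment column are identical or conservative."""
    residues = set(column)
    if "-" in residues:          # columns with gaps count as not conserved
        return False
    if len(residues) == 1:       # identical in every sequence
        return True
    return any(residues <= group for group in CONSERVATIVE_GROUPS)


def percent_similarity(aligned):
    """Percentage of columns that are identical or conservative in all sequences."""
    lengths = {len(seq) for seq in aligned}
    if len(lengths) != 1:
        raise ValueError("aligned sequences must have equal length")
    n_columns = lengths.pop()
    if n_columns == 0:
        return 0.0
    similar = sum(column_is_similar(col) for col in zip(*aligned))
    return 100.0 * similar / n_columns


# Toy alignment (hypothetical sequences): 3 of 4 columns are similar.
print(percent_similarity(["MKV-", "MRIA"]))
```

The same function accepts any number of aligned sequences, so it applies both to pairwise comparisons and to the six-sequence comparison, where fewer columns are expected to qualify.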

Second, in contrast to E. coli, which contains a unique tatC gene (10), B. 
subtilis was shown to contain two paralogous tatC-like genes (i.e. ycbT and ydiJ). 
The YcbT protein (245 residues), which was renamed TatCd, and the YdiJ protein 
(254 residues), which was renamed TatCy, showed significant similarity to the E. 
coli TatC protein (57% identical residues and conservative replacements in the 
three aligned sequences; Fig. 1B). Like TatC of E. coli, TatCd and TatCy of B. 
subtilis have six potential transmembrane segments (Fig. 1B), and the amino-
termini of these proteins are predicted to face the cytoplasm (data not shown). 

In contrast to E. coli, in which the tatA, tatB and tatC genes form one 
operon while the tatE gene is monocistronic (Sargent et al. (1998) EMBO J. 17, 
3640-3650), the tat genes of B. subtilis are located at three distinct chromosomal 
regions. Two of these regions contain adjacent tatA and tatC genes, the tatAd 
and tatAy genes being located immediately upstream of the tatCd and tatCy 
genes, respectively (Fig. 2). Strikingly, the tatAd and tatCd genes, which map at 
24.4° on the B. subtilis chromosome, are located immediately downstream of the 
phoD gene, specifying a secreted protein with a putative RR-signal peptide (Table 
I). Furthermore, the tatAy and tatCy genes are located at 55.3° on the B. subtilis 
chromosome, within a cluster of genes with unknown function (Fig. 2), and the 
tatAc gene is located at 162.7° on the B. subtilis chromosome (data not shown), 
immediately downstream of the cotC gene specifying a spore coat protein 
(Donovan et al. (1987) J. Mol. Biol. 196, 1-10). Finally, a tatD-like gene, denoted 
yabD, is located at 4.1° on the B. subtilis chromosome, immediately downstream 
of the metS gene encoding a methionyl-tRNA synthetase (data not shown). 
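
Map positions given in degrees can be converted to approximate kilobase coordinates. The sketch below is a rough illustration that assumes a circular chromosome with 0° at the start of the numbering, a linear relation between degrees and base pairs, and a genome length of 4,214,810 bp as reported for B. subtilis 168 in the genome paper cited above. The function name and the default parameter are hypothetical.

```python
# Assumed genome length of B. subtilis 168 in base pairs (Kunst et al., 1997).
GENOME_LENGTH_BP = 4_214_810


def map_degrees_to_kb(degrees, genome_length_bp=GENOME_LENGTH_BP):
    """Convert a chromosomal map position in degrees to kilobases.

    Assumes a circular map of 360 degrees that scales linearly with sequence.
    """
    if not 0.0 <= degrees < 360.0:
        raise ValueError("map position must lie in [0, 360)")
    return degrees / 360.0 * genome_length_bp / 1000.0


# Map positions quoted in the text, in degrees.
for gene, position in [("tatAd/tatCd", 24.4), ("tatAy/tatCy", 55.3),
                       ("tatAc", 162.7), ("yabD", 4.1)]:
    print(f"{gene}: about {map_degrees_to_kb(position):.0f} kb")
```

The resulting coordinates are estimates only, since the genetic map in degrees is rounded to one decimal place.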

Taken together, these observations strongly suggest that B. subtilis has a 
Tat pathway for the translocation of proteins with RR-signal peptides across the 
cytoplasmic membrane. Furthermore, the observation that the tatAd and tatCd 
genes are located downstream of the phoD gene, which is a member of the pho 
regulon (Eder et al. (1996) Microbiology 142, 2041-2047), suggests that the tatAd 
and tatCd genes might be exclusively expressed under conditions of phosphate 
starvation. 

Example 2 

TatC-dependent secretion of the PhoD protein 
To investigate whether an active Tat pathway exists in B. subtilis, various 
single and double tatC mutants were constructed. To this purpose, the tatCd 
gene was either disrupted with a Km resistance marker, or it was placed under 
the control of the IPTG-dependent Pspac promoter of plasmid pMutin2, resulting 
in the B. subtilis strains ΔtatCd and ItatCd, respectively (Fig. 3, A and B). 
Similarly, the tatCy gene was either disrupted with an Sp resistance marker, or it 
was placed under the control of the IPTG-dependent Pspac promoter of plasmid 
pMutin2, resulting in the B. subtilis strains ΔtatCy and ItatCy, respectively (Fig. 3, 
A and C). Double tatCd-tatCy mutants were constructed by transforming the 
ΔtatCy mutant with chromosomal DNA of the ΔtatCd or ItatCd mutant strains. 

Table II lists the plasmids and bacterial strains used. TY medium 
(tryptone/yeast extract) contained Bacto tryptone (1%), Bacto yeast extract (0.5%) 
and NaCl (1%). Minimal medium (MM) was prepared as described in Tjalsma et 
al. (1998) Genes Dev. 12, 2318-2331. Schaeffer's sporulation medium (SSM) 
was prepared as described in Schaeffer et al. (1965) Proc. Natl. Acad. Sci. USA 
271, 5463-5467. High phosphate (HPDM) and low phosphate (LPDM) defined 
media were prepared as described in Muller et al. (1997) Microbiology 143, 947-
956. To test anaerobic growth, S7 medium was prepared as described in van 
Dijl et al. (1991) J. Gen. Microbiol. 137, 2073-2083 and van Dijl et al. (1991) Mol. 
Gen. Genet. 227, 40-48 and supplemented with NaNO3 (0.2%) and glycerol (2%). 
When required, media for E. coli were supplemented with ampicillin (Ap; 100 
µg/ml), erythromycin (Em; 100 µg/ml), kanamycin (Km; 40 µg/ml), or 
spectinomycin (Sp; 100 µg/ml); media for B. subtilis were supplemented with Em 
(1 µg/ml), Km (10 µg/ml), Sp (100 µg/ml), and/or isopropyl-β-D-thiogalacto-
pyranoside (IPTG; 100 µM). 

Procedures for DNA purification, restriction, ligation, agarose gel 
electrophoresis, and transformation of E. coli were carried out as described in 
Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual, 2nd Ed., Cold 
Spring Harbor Laboratory, Cold Spring Harbor, NY. Enzymes were from Roche 
Molecular Biochemicals. B. subtilis was transformed as described in Tjalsma et 
al. (1997) J. Biol. Chem. 272, 25983-25992. PCR (polymerase chain reaction) 
was carried out with the Pwo DNA polymerase (New England Biolabs) as 
described in van Dijl et al. (1995) J. Biol. Chem. 270, 3611-3618. 

To construct B. subtilis ItatCd, the 5' region of the tatCd gene was 
amplified by PCR with the primers JJ14bT (5'-CCC AAG CTT ATG AAA GGG 
AGG GCT TTT TTG AAT GG-3') containing a HindIII site, and JJ15bT (5'-GCG 
GAT CCA AAG CTG AGC ACG ATC GG-3') containing a BamHI site. The 
amplified fragment was cleaved with HindIII and BamHI, and cloned in the 
corresponding sites of pMutin2 (Vagner et al. (1998) Microbiology 144, 3097-3104), 
resulting in pMICd1. B. subtilis ItatCd was obtained by a Campbell-type 
integration (single cross-over) of pMICd1 into the tatCd region of the 
chromosome. 
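
Primer designs of this kind can be sanity-checked by confirming that each primer carries the recognition sequence of the enzyme named for it. The Python sketch below is a minimal illustration: the recognition sequences listed are the standard ones for these enzymes, the function names are hypothetical, and because all listed sites are palindromic only one strand is searched.

```python
# Standard recognition sequences (5'->3'); all are palindromic.
RECOGNITION_SITES = {
    "HindIII": "AAGCTT",
    "BamHI": "GGATCC",
    "EcoRI": "GAATTC",
    "PstI": "CTGCAG",
    "KpnI": "GGTACC",
    "SalI": "GTCGAC",
}


def normalize(primer):
    """Remove the triplet spacing used in the text and convert to upper case."""
    return "".join(primer.split()).upper()


def has_site(primer, enzyme):
    """True if the primer contains the recognition sequence of `enzyme`."""
    return RECOGNITION_SITES[enzyme] in normalize(primer)


# Primers JJ14bT and JJ15bT as given in the text.
JJ14BT = "CCC AAG CTT ATG AAA GGG AGG GCT TTT TTG AAT GG"
JJ15BT = "GCG GAT CCA AAG CTG AGC ACG ATC GG"

print(has_site(JJ14BT, "HindIII"), has_site(JJ15BT, "BamHI"))
```

The check confirms only the presence of a site, not that the enzyme cuts efficiently so close to the end of a PCR product.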

To construct B. subtilis ItatCy, the 5' region of the tatCy gene was amplified 
by PCR with the primers JJ03iJ (5'-CCC AAG CTT AAA AAG AAA GAA GAT 
CAG TAA GTT AGG ATG-3') containing a HindIII site, and JJ04iJ (5'-GCG GAT 
CCA AGT CCT GAG AAA TCC G-3') containing a BamHI site. The amplified 
fragment was cleaved with HindIII and BamHI, and cloned in the corresponding 
sites of pMutin2, resulting in pMICy1. B. subtilis ItatCy was obtained by a 
Campbell-type integration (single cross-over) of pMICy1 into the tatCy region of 
the chromosome. 

To construct B. subtilis ΔtatCd, the tatCd gene was amplified by PCR with 
primer JJ33Cdd (5'-GGA ATT CGT GGG ACG GCT ACC-3') containing an EcoRI 
site and 5' sequences of tatCd, and primer JJ34Cdd (5'-CGG GAT CCA TCA 
TGG GAA GCG-3') containing a BamHI site and 3' sequences of tatCd. Next, the 
PCR-amplified fragment was cleaved with EcoRI and BamHI and ligated into the 
corresponding sites of pUC21, resulting in pJCd1. Plasmid pJCd2 was obtained 
by replacing an internal BclI-AccI fragment of the tatCd gene in pJCd1 with a 
pDG792-derived Km resistance marker, flanked by BamHI and ClaI restriction 
sites. Finally, B. subtilis ΔtatCd was obtained by a double cross-over 
recombination event between the disrupted tatCd gene of pJCd2 and the 
chromosomal tatCd gene. 

To construct B. subtilis ΔtatCy, the tatCy gene was amplified by PCR with 
primer JJ29Cyd (5'-GGG GTA CCG GAA AAC GCT TGA TCA GG-3') containing 
a KpnI site and 5' sequences of tatCy, and primer JJ30Cyd (5'-CGG GAT CCT 
TTG GGC GAT AGC C-3') containing a BamHI site and 3' sequences of tatCy. 
Next, the PCR-amplified fragment was cleaved with KpnI and BamHI and ligated 
into the Asp718 and BamHI sites of pUC21, resulting in pJCy1. Plasmid pJCy2 
was obtained by ligating a pDG1726-derived Sp resistance marker, flanked by 
PstI restriction sites, into the unique PstI site of the tatCy gene in pJCy1. Finally, 
B. subtilis ΔtatCy was obtained by a double cross-over recombination event 
between the disrupted tatCy gene of pJCy2 and the chromosomal tatCy gene. 

Double tatCd-tatCy mutants were constructed by transforming the ΔtatCy 
mutant with chromosomal DNA of the ΔtatCd or ItatCd mutant strains. Correct 
integration of plasmids or resistance markers into the chromosome of B. subtilis 
was verified by Southern blotting. The BLAST algorithm (Altschul et al. (1997) 
Nucleic Acids Res. 25, 3389-3402) was used for protein comparisons in 
GenBank. Protein sequence alignments were carried out with the ClustalW 
program (Thompson et al. (1994) Nucleic Acids Res. 22, 4673-4680), using the 
Blosum matrices, or version 6.7 of the PCGene Analysis Program (Intelligenetics 
Inc.). Putative transmembrane segments, and their membrane topologies were 
predicted with the TopPred2 algorithm (Sipos et al. (1993) Eur. J. Biochem. 213, 
1333-1340 and Cserzo et al. (1997) Protein Eng. 10, 673-676). 

Competence and sporulation- Competence for DNA binding and uptake 
was determined by transformation with plasmid or chromosomal DNA (Bron et al. 
(1972) Mutat. Res. 15, 1-10). The efficiency of sporulation was determined by 
overnight growth in SSM medium, killing of cells with 0.1 volume chloroform, and 
subsequent plating. 

Western blot analysis and immunodetection- To detect PhoB and 
PhoD, B. subtilis cells were separated from the growth medium by centrifugation 
(2 min, 14,000 rpm, room temperature). Proteins in the growth medium were 
concentrated 20-fold upon precipitation with trichloroacetic acid, and samples for 
SDS polyacrylamide gel electrophoresis (PAGE) were prepared as described 
previously in Laemmli, U. K. (1970) Nature 227, 680-685. After separation by 
SDS-PAGE, proteins were transferred to a nitrocellulose membrane (Schleicher 
and Schüll) as described in Towbin et al. (1979) Proc. Natl. Acad. Sci. USA 76, 
4350-4354. PhoB and PhoD were visualized with specific antibodies (Müller, J. 
P., and Wagner, M. (1999) FEMS Microbiol. Lett. 180, 287-296) and alkaline 
phosphatase-conjugated goat anti-rabbit antibodies (SIGMA) according to the 
manufacturer's instructions. 

Two-dimensional (2D) gel electrophoresis of secreted proteins. B. 
subtilis strains were grown at 37°C under vigorous agitation in 1 litre of a synthetic 
medium (Antelmann et al. (1997) J. Bacteriol 179, 7251-7256, and Antelmann et 
al. (2000) J. Bacteriol. in press) containing 0.16 mM KH2PO4 to induce a 
phosphate starvation response. After 1 hour of post-exponential growth, cells 
were separated from the growth medium by centrifugation. The secreted proteins 
in the growth medium were precipitated overnight with ice-cold 10% trichloroacetic 
acid, and collected by centrifugation (40000 g, 2 h, 4°C). The pellet was washed 3 
times with 96% ethanol, dried and resuspended in 400 µl of rehydration solution 
containing 2 M thiourea, 8 M urea, 1% Nonidet P40, 20 mM DTT and 0.5% 
Pharmalyte (pH 3-10). Cells were disrupted by sonication as described in 
Eymann et al. (1996) Microbiology 142, 3163-3170, and cellular proteins were 
resuspended in rehydration solution as described above. Samples of secreted or 
cellular proteins in rehydration solution were used for the re-swelling of 
immobilized pH gradient (IPG) strips (pH range 3-10). Next, protein separation in 
the IPG strips (first dimension electrophoresis) was performed as recommended 
by the manufacturer (Amersham Pharmacia Biotech). Electrophoresis in the 
second dimension was performed as described in Bernhardt et al. (1997) 
Microbiology 143, 999-1017. The resulting 2D-gels were stained with silver nitrate 
(Blum et al. (1987) Electrophoresis 8, 93-99) or Coomassie Brilliant Blue R250. 

Protein identification. In-gel tryptic digestion of proteins, separated by 2D 
gel electrophoresis, was performed using a peptide-collecting device (Otto et al. 
(1996) Electrophoresis 17, 1643-1650). To this purpose, 0.5 µl peptide solution 
was mixed with an equal volume of a saturated α-cyano-4-hydroxycinnamic acid 
solution in 50% acetonitrile and 0.1% trifluoroacetic acid. The resulting mixture 
was applied to the sample template of a matrix-assisted laser 
desorption/ionization mass spectrometer (Voyager DE-STR, PerSeptive 
Biosystems). Peptide mass fingerprints were analysed using the 'MS-Fit' software, 
as provided by Baker and Clauser through http://prospector.ucsf.edu. 

The fact that double tatCd-tatCy mutants could be obtained shows that 
TatC function is not essential for viability of B. subtilis, at least not when cells are 
grown aerobically in TY or minimal medium at 37°C, or anaerobically in S7 
medium, supplemented with NaNO3 (0.2%) and glycerol (2%) at 37°C (data not 
shown). Furthermore, the ΔtatCd-ΔtatCy double mutation did not inhibit the 
development of competence for DNA binding and uptake, sporulation and the 
subsequent spore germination (data not shown), showing that these primitive 
developmental processes do not require TatC function. 

The effects of single and double tatC mutations on protein secretion via the 
Tat pathway were studied using PhoD as a native reporter protein. To this 
purpose, tatC mutant strains were grown under conditions of phosphate 
starvation, using LPDM medium. As shown by Western blotting, the secretion of 
PhoD was strongly reduced in the ΔtatCd mutant strain and the ΔtatCd-ΔtatCy 
double mutant, whereas it was not affected or even improved in the ΔtatCy 
mutant strain (Fig. 4A). In contrast, the secretion of the alkaline phosphatase 
PhoB, which is dependent on the major (Sec) pathway for protein secretion (49), 
was not affected in the tatC mutants of B. subtilis (Fig. 4B). Notably, in some 
experiments, very low amounts of PhoD were detectable in the growth medium of 
B. subtilis ΔtatCd (data not shown), but never in that of ΔtatCd-ΔtatCy or ItatCd-
ΔtatCy double mutants (Fig. 4, A and C). As exemplified with the B. subtilis 
ItatCd-ΔtatCy double mutant strain, the cells of all tatC mutant strains contained 
similar amounts of pre-PhoD, which were comparable to those in the parental 
strain 168 (Fig. 4C; data not shown). Finally, 2D-gel electrophoresis of proteins in 
the medium of phosphate-starved cells of B. subtilis ΔtatCd-ΔtatCy or the parental 
strain 168 showed that PhoD is the only protein of which the secretion is 
detectably affected by the double tatC mutation under conditions of phosphate 
starvation (Fig. 5). As expected, the secretion of proteins lacking an RR-signal 
peptide, such as the glycerophosphoryl diester phosphodiesterase GlpQ, the 
pectate lyase Pel, the alkaline phosphatases PhoA and PhoB, the phosphate-
binding protein PstS, the minor extracellular serine protease Vpr, the PBSX 
prophage protein XkdE and the protein with unknown function YncM, was not 
significantly affected by the double tatC mutation. Surprisingly, however, the 
secretion of YdhF, a protein of unknown function, which does have a potential 
RR-signal peptide (Table I), was also not affected by the disruption of tatCd and 
tatCy (Fig. 5). Consistent with the above observations, no differences in the 
cellular proteomes of B. subtilis ΔtatCd-ΔtatCy and the parental strain 168 could 
be detected by 2D-gel electrophoresis (data not shown). 

In summary, these results show that an active Tat pathway exists in B. 
subtilis, and that TatCd has a critical role in the secretion of PhoD. 

Example 3 

Expression of tatCd and tatCy genes 
To study the expression of the tatCd and tatCy genes, the transcriptional 
tatCd-lacZ and tatCy-lacZ gene fusions, present in B. subtilis ItatCd and ItatCy, 
respectively, were used. 

Enzyme activity assays- The assay and the calculation of β-
galactosidase units (expressed as units per OD600) were carried out as 
described in Miller, J. H. (1982) Experiments in Molecular Biology, Cold Spring 
Harbor Laboratory Press, Cold Spring Harbor NY. Overnight cultures were diluted 
100-fold in fresh medium and samples were taken at hourly intervals for OD600 
readings and β-galactosidase activity determinations. Induction of the phosphate 
starvation response was monitored by alkaline phosphatase activity 
determinations as described in Hulett et al. (1990) J. Bacteriol. 172, 735-740. 
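
The calculation of β-galactosidase units is not spelled out here. As a hedged sketch, the widely used Miller formula normalises the absorbance of the reaction product at 420 nm to reaction time, sample volume and culture density. The Python function below assumes that formula, including the customary light-scattering correction at 550 nm; the function and parameter names are hypothetical, and the example readings are invented.

```python
def miller_units(a420, a550, minutes, volume_ml, od600):
    """β-Galactosidase activity in Miller units (assumed standard formula).

    units = 1000 * (A420 - 1.75 * A550) / (t * v * OD600)
    where t is the reaction time in minutes and v the culture volume
    assayed in millilitres.
    """
    if minutes <= 0 or volume_ml <= 0 or od600 <= 0:
        raise ValueError("time, volume and OD600 must be positive")
    return 1000.0 * (a420 - 1.75 * a550) / (minutes * volume_ml * od600)


# Invented readings for one hourly sample, for illustration only.
print(miller_units(a420=0.5, a550=0.0, minutes=10, volume_ml=0.1, od600=0.5))
```

Applying the function to each hourly sample gives an activity profile that can be compared between strains and growth media.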

As expected, upon a medium shift from high phosphate (HPDM) to low 
phosphate (LPDM) medium in order to induce a phosphate starvation response, 
tatCd transcription could be observed in B. subtilis ItatCd. In this strain, relatively 
low, but constant levels of β-galactosidase production were reached within a 
period of four hours after the change to LPDM medium, while no β-galactosidase 
production was detectable in the parental strain 168 (no lacZ gene fusion present; 
Table II). In contrast, when cells of B. subtilis ItatCd were grown in minimal (MM), 
sporulation (SSM) or tryptone/yeast extract (TY) media, none of which induces a 
phosphate starvation response, no transcription of the tatCd gene was detectable; 
under these conditions, the β-galactosidase levels in cells of B. subtilis ItatCd 
were similar to those of the parental strain 168. Completely different results were 
obtained with B. subtilis ItatCy: the tatCy gene was transcribed in all growth media 
tested and, notably, the transcription of tatCy in LPDM medium was much higher 
than that of the tatCd gene (Table III). In contrast to the tatCd gene, the highest 
levels of tatCy transcription were observed in MM and TY medium, while the 
lowest levels of tatCy transcription were observed in SSM medium (Table III). In 
conclusion, these findings show that tatCd is only transcribed under conditions of 
phosphate starvation, in contrast to tatCy, which is transcribed under all 
conditions tested. 

Example 4 

PhoD is not transported in E. coli 
Plasmids, bacterial strains and media - Table 5 lists the plasmids and 
bacterial strains used. TY medium (tryptone/yeast extract) contained Bacto 
tryptone (1%), Bacto yeast extract (0.5%) and NaCl (1%). For pulse-chase 
labelling experiments M9 minimal medium was prepared as described (Müller et 
al. (1992) Suppression of the growth and export defects of an Escherichia coli 
secA(Ts) mutant by a gene cloned from Bacillus subtilis. Mol. Gen. Genet. 235, 
89-96). When required, media were supplemented with ampicillin (100 µg/ml), 
kanamycin (40 µg/ml), chloramphenicol (20 µg/ml), tetracycline (12.5 µg/ml), 
arabinose (0.2%), isopropyl-β-D-thiogalactopyranoside (IPTG; 100 µM), nigericin 
(1 µM) and/or sodium azide (3 mM). [35S]-Methionine was from Hartmann Analytic 
(Braunschweig, Germany), [14C]-labelled molecular weight marker from 
Amersham International (Amersham, Bucks, U.K.). 

DNA techniques - Procedures for DNA purification, restriction, ligation, 
agarose gel electrophoresis, and transformation of E. coli were carried out as 
described in Sambrook et al. Restriction enzymes were from MBI Fermentas. 
PCR (polymerase chain reaction) was carried out with the VENT DNA polymerase 
(New England Biolabs). 

To construct pAR3phoD, the phoD gene including its ribosome binding site 
was amplified from the chromosome of B. subtilis strain 168 by PCR using the 
primers P1 (5'-GAG GAT CCA TGA GGA GAG AGG GGA TCT TGA ATG GCA 
TAC GAC-3') containing a BamHI site, and P2 (5'-CGA TCC TGC AGG ACC TCA 
TCG GAT TGC-3') containing a PstI site. The amplified fragment was cleaved 
with BamHI and PstI, and cloned in the corresponding sites of pAR3. The 
resulting plasmid pAR3phoD allowed the arabinose-inducible expression of wild-
type phoD in E. coli. 

To construct a gene fusion between the bla and phoD genes, the signal 
sequence-less phoD was amplified using primer P3 (5'-GTA GGA TCC GCG 
CCT AAC TTC TCA AGC-3') containing a BamHI site and primer P2 containing a 
PstI site. The amplified fragment was cleaved with BamHI and PstI, and cloned in 
the corresponding sites of pUC19, resulting in plasmid pUC19'phoD. Next, the 5' 
region of TEM-β-lactamase encoding its signal sequence was amplified from 
plasmid pBR322 by PCR with primer B1 (5'-ATA GAA TTC AAA AAG GAA GAG 
TAT G-3') containing an EcoRI site, and primer B2 (5'-CTG GGG ATC CAA AAA 
CAG GAA GGC-3') containing a BamHI site. The amplified PCR fragment was 
cleaved with BamHI and EcoRI and inserted into pUC19'phoD, cleaved with the 
same restriction enzymes, resulting in plasmid pUC19bla-phoD. For easy 
selection of recombinant clones, plasmid pOR124, containing a tetracycline 
resistance gene, was inserted 3' of the bla-phoD gene fusion using a unique PstI 
site. From the resulting plasmid pUC19bla-phoD-Tc an EcoRI-BglII fragment 
containing bla-phoD and the tetracycline resistance gene of pOR124 was 
isolated and inserted into pMutin2 cleaved with EcoRI and BamHI. In plasmid 
pMutin2bla-phoD the bla-phoD gene fusion is under control of the IPTG-inducible 
Pspac promoter. 

To construct a gene fusion consisting of the signal sequence of phoD and 
lacZ, a DNA fragment encoding the signal peptide of PhoD and the translational 
start site of phoD was amplified by PCR with primer P1 containing a BamHI site 
and primer P4 (5'-GAG AAG GTC GAC GCA GCA TTT ACT TCA AAG GCC CC-
3') containing a SalI site, and inserted into the corresponding sites of pOR124, 
resulting in plasmid pOR124phoD. Next, the lacZ gene lacking nine 5' terminal 
codons was amplified using primer L1 (5'-ACC GGG TCG ACC GTC GTT TTA 
CAA CG-3') containing a SalI site and primer L2 (5'-GGG AAT TCA TGG CCT 
GCC CGG TT-3') containing an EcoRI site and subsequently inserted into the 
corresponding sites of pOR124phoD. The resulting plasmid pOR124phoD-lacZ 
was linearized with BamHI and inserted into pAR3 cleaved with BglII. The 
resulting plasmid pAR3phoD-lacZ allows the arabinose-inducible expression of 
the phoD-lacZ gene fusion. 

To obtain a plasmid mediating inducible overexpression of tatAd/tatCd of
B. subtilis, the DNA region containing these genes including their ribosome
binding sites was amplified by PCR with the primers T1 (5'-CAA GGA TCC CGA
ATT AAG GAG TGG-3') containing a BamHI site and primer T2 (5'-GGT CTG
CAG CTG CAC TAA GCG GCC GCC-3') containing a PstI site. The amplified
fragment was cleaved with BamHI and PstI and cloned into the corresponding
sites of pQE9 (QIAGEN), resulting in pQE9tatAd/Cd.
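
The primer descriptions above can be checked mechanically. The following is a minimal sketch, not part of the described method, that searches each printed primer sequence for the recognition sequence of the restriction enzyme named for it. The dictionary names and the helper `find_sites` are illustrative assumptions; the recognition sequences are the standard ones for BamHI, EcoRI, PstI and SalI, and because all four are palindromic a search of the printed strand is sufficient.

```python
# Recognition sequences (5' to 3') of the restriction enzymes named in the text.
RESTRICTION_SITES = {
    "BamHI": "GGATCC",
    "EcoRI": "GAATTC",
    "PstI": "CTGCAG",
    "SalI": "GTCGAC",
}

# Primer sequences as printed in the text (5' to 3'); spacing is ignored.
PRIMERS = {
    "B1": "ATA GAA TTC AAA AAG GAA GAG TAT G",
    "B2": "CTG GGG ATC CAA AAA CAG GAA GGC",
    "P4": "GAG AAG GTC GAC GCA GCA TTT ACT TCA AAG GCC CC",
    "L1": "ACC GGG TCG ACC GTC GTT TTA CAA CG",
    "L2": "GGG AAT TCA TGG CCT GCC CGG TT",
    "T1": "CAA GGA TCC CGA ATT AAG GAG TGG",
    "T2": "GGT CTG CAG CTG CAC TAA GCG GCC GCC",
}


def find_sites(primer):
    """Return the names of the listed enzymes whose site occurs in the primer."""
    seq = primer.replace(" ", "").upper()
    return [name for name, site in RESTRICTION_SITES.items() if site in seq]


if __name__ == "__main__":
    for name, primer in PRIMERS.items():
        print(name, find_sites(primer))
```

Each primer is expected to report exactly the enzyme stated for it in the text; a mismatch would point to a transcription error in the printed sequence.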

To obtain TG1 ΔtatABCDE, plasmids pFAT44 and subsequently pFAT126,
carrying in-frame deletions of the E. coli tatE and tatABCD genes, respectively, were
transferred to the chromosome of TG1 as described. Mutant strain TG1
ΔtatABCDE was verified phenotypically by its mutant cell septation phenotype,
hypersensitivity to SDS and resistance to P1 phages as described (Stanley et al.
(2001) Escherichia coli strains blocked in Tat-dependent protein export exhibit
pleiotropic defects in the cell envelope. J. Bacteriol. 183, 139-144).

SDS-PAGE and Western blot analysis - SDS-polyacrylamide gel 
electrophoresis (SDS-PAGE) was carried out as described by Laemmli (Laemmli, 
U.K. (1970) Cleavage of structural proteins during assembly of the head of 
bacteriophage T4. Nature, 227, 680-685). After separation by SDS-PAGE, 
proteins were transferred to a nitrocellulose membrane (Schleicher and Schüll) as
described by Towbin et al. (Towbin et al. (1979) Electrophoretic transfer of
proteins from polyacrylamide gels to nitrocellulose sheets: procedure and some
applications. Proc. Natl. Acad. Sci. USA, 76, 4350-4354). Proteins were visualised
using specific antibodies against PhoD (16), LacZ (5PRIME-3PRIME, Boulder,
USA) and SecB (laboratory collection) and alkaline phosphatase-conjugated goat 
anti-rabbit antibodies (SIGMA) according to the manufacturer's instructions. 

Pulse-chase experiments, immunoprecipitation and quantification of
protein - Pulse-labelling experiments with E. coli strains were performed as
described earlier (Müller et al. (1992) Suppression of the growth and export
defects of an Escherichia coli secA(Ts) mutant by a gene cloned from Bacillus
subtilis. Mol. Gen. Genet. 235, 89-96). Cultures were pulse-labelled with 100 µCi
[35S]-methionine and chased with unlabelled methionine. Samples were taken at
the times indicated and immediately precipitated with trichloroacetic acid
(0°C). After cell lysis, proteins were precipitated with specific antibodies against
PhoD (Müller, J. P. and Wagner, M. (1999) Localisation of the cell wall-associated
phosphodiesterase PhoD of Bacillus subtilis. FEMS Microbiol. Lett., 180, 287-
296) or β-lactamase and β-galactosidase (5PRIME-3PRIME, Boulder, USA).
Relative amounts of radioactivity were estimated by using a PhosphoImager (Fuji)
and the associated image analysis software PC-BAS.

In vivo protease mapping - In vivo protease mapping was carried out
according to Kiefer et al. (EMBO J. (1997) 16, 2197-2204). For spheroplast
formation, cells were grown in TY medium to exponential growth. For induction of
gene expression the medium was supplemented with arabinose (0.2 %) and/or
IPTG (1 mM) for 60 min. After spheroplast formation, cells were treated with
proteinase K (SIGMA), with proteinase K and Triton X-100, or remained untreated.
Detection of cytosolic SecB confirmed that spheroplasts not treated with
Triton X-100 were resistant to proteinase K.

Determination of β-galactosidase activity - The assay and the calculation of
β-galactosidase units (expressed as units per OD600) were carried out as
described by Miller ((1972) Experiments in molecular genetics. Cold Spring
Harbor Laboratory, Cold Spring Harbor, N.Y.) using 2-nitrophenyl-β-D-
galactopyranoside (ONPG, Serva). Enzymatic activity in the supernatant of
lysozyme-treated spheroplasts reflected the activity in the periplasmic space.
Activity associated with the spheroplasts represented the cytosolic and
cytoplasmic membrane-bound activity.
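
For orientation, the unit calculation referred to above can be sketched as follows. This is a minimal illustration assuming the formula as it is commonly quoted from Miller (1972), units = 1000 x (A420 - 1.75 x A550) / (t x v x OD600); the function name, parameter names and the example readings are assumptions and are not taken from the present experiments.

```python
def miller_units(a420, a550, od600, minutes, volume_ml):
    """Beta-galactosidase activity in Miller units.

    a420      absorbance of the reaction at 420 nm (o-nitrophenol plus scatter)
    a550      absorbance at 550 nm (light scattering by cell debris)
    od600     optical density of the culture at 600 nm
    minutes   reaction time in minutes
    volume_ml volume of culture used in the assay, in ml
    """
    if od600 <= 0 or minutes <= 0 or volume_ml <= 0:
        raise ValueError("od600, minutes and volume_ml must be positive")
    # 1.75 * A550 corrects the A420 reading for light scattering.
    return 1000.0 * (a420 - 1.75 * a550) / (minutes * volume_ml * od600)


if __name__ == "__main__":
    # Hypothetical readings, for illustration only.
    print(miller_units(a420=0.5, a550=0.02, od600=0.4, minutes=10, volume_ml=0.1))
```

Applied separately to the lysozyme supernatant and to the spheroplast fraction, the same calculation gives the periplasmic and cell-bound activities, respectively.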

PhoD is not transported in E. coli - The initial aim was to test whether PhoD
could be exported by the Tat pathway in E. coli. For this purpose, we placed the
gene encoding this peptide under the control of the PBAD promoter of Salmonella
typhimurium located on plasmid pAR3. The resulting plasmid allowed the
arabinose-inducible production of enzymatically active PhoD in E. coli TG1 (data
not shown). Since phosphodiesterase is highly toxic to E. coli, cell growth
ceased immediately after induction of phoD expression. In order
to quantify transport of PhoD in E. coli TG1(pARphoD), pulse-chase experiments
were performed. As shown in Fig. 8, no processing of the wild-type prePhoD was
observed even after a 60 min chase, indicating that prePhoD was not translocated
by the E. coli Tat machinery. The localisation of PhoD was further analysed by in vivo
protease mapping. As shown in Fig. 8, prePhoD was not accessible to
proteinase K at the outer side of the cytosolic membrane, demonstrating that
PhoD remains in the cytosol in E. coli TG1(pARphoD).

PhoD can be transported via the Sec-dependent protein translocation
pathway - Absence of prePhoD processing in E. coli could be due to inefficient
recognition of the signal peptide of PhoD by the E. coli Tat machinery or due to
the nature of the mature part of the PhoD peptide. This B. subtilis protein could
have unexpected folding characteristics or require co-factors not present in
E. coli. In order to address this question, the DNA encoding the mature peptide of
PhoD was fused to the region encoding the signal peptide of TEM-β-lactamase
(SPBla). The resulting gene fusion was cloned into the pMUTIN2 vector containing
an IPTG-inducible Pspac promoter, allowing the synthesis of the SPBla-PhoD
peptide. The transport and processing of this fusion protein was analysed by
immunoblotting of whole cell extracts of E. coli strain TG1(pMUTIN2bla-phoD).
As shown in Fig. 8A, lane 2, SPBla-PhoD was completely converted to a protein
with the molecular weight of mature PhoD, indicating efficient transport of the
protein. In order to elucidate the export path used for SPBla-PhoD translocation,
Sec-dependent transport was selectively inhibited by addition of sodium azide (3
mM). While the presence of sodium azide abolished conversion of SPBla-PhoD to
PhoD, addition of nigericin did not retard processing of SPBla-PhoD (Fig. 8A, lanes
3 and 4). To analyse the Sec-dependence of SPBla-PhoD transport in more detail,
expression of bla-phoD in E. coli TG1(pMUTIN2bla-phoD) was induced in the
presence or absence of sodium azide, cells were pulse-labelled with
[35S]-methionine and PhoD was subsequently immunoprecipitated. Fig. 8B
demonstrates the kinetics of conversion of SPBla-PhoD to mature PhoD. The
presence of sodium azide significantly retarded maturation of SPBla-PhoD
(Fig. 8C). These data indicate that PhoD can be transported in E. coli in a
Sec-dependent manner. Thus, it can be concluded that the mature part of PhoD,
lacking its signal peptide, does not itself canalise the export route and
does not prevent efficient transport or processing.

The signal peptide of PhoD cannot mediate transport of LacZ in E. coli wild
type cells - It has been shown that signal peptides containing a twin-arginine motif
can canalise transport of heterologous proteins via the Tat-dependent
translocation route (reviewed in Wu et al. (2000) Bacterial twin-arginine signal
peptide-dependent protein translocation pathway: evolution and mechanism. J.
Mol. Microbiol. Biotechnol. 2, 179-189). The signal peptide of the E. coli TMAO
reductase (TorA) has been successfully used to mediate Tat-dependent transport
of the thylakoidal protein 23K, the glucose-fructose oxidoreductase GFOR of
Zymomonas mobilis and the green fluorescent protein GFP. Other reports
indicated that Tat signal peptides could determine the specificity of the Tat-
dependent transport (Wu, supra). For example, GFOR could not be translocated
in E. coli (28).

To test whether the signal peptide of PhoD is recognised by the E. coli Tat
machinery and could canalise the transport of a protein in E. coli, we constructed
a gene fusion consisting of the DNA region encoding the 56 amino acid residues
of the PhoD signal peptide (SPphoD) and the lacZ gene encoding β-galactosidase as a
reporter protein. The gene hybrid was inserted into plasmid pAR3, resulting in
plasmid pAR3phoD-lacZ. Induction of production of the SPphoD-LacZ fusion
protein in E. coli TG1 resulted in LacZ+ colonies (data not shown). Hence, correct
folding and tetramerisation of the peptide, a prerequisite for its activity, does
occur in E. coli.

To analyse whether the signal peptide of PhoD could mediate translocation of
LacZ into an extracytosolic localisation, enzymatic activity of LacZ was monitored
in E. coli TG1(pAR3phoD-lacZ). As shown in Table II, the majority of LacZ activity
remained in the cytosol or at the cytosolic membrane. Since absence of enzymatic
LacZ activity could be a result of inefficient folding rather than absence of
transport, we next studied localisation of LacZ by using in vivo protease mapping.
As shown in Fig. 9A, no processing of SPphoD-LacZ could be observed. The
SPphoD-LacZ fusion protein was not susceptible to protease digestion in
spheroplasts. When spheroplasts were destroyed by addition of Triton X-100, the
unprocessed SPphoD-LacZ protein became protease sensitive (Fig. 9A, lane 3).
The reliability of the method was verified by using the cytosolic protein SecB as
an internal control (Fig. 9A). In spheroplasts SecB was resistant to proteinase K, but
was digested after solubilising the spheroplasts with Triton X-100.

Export of SPphoD-LacZ fusion protein in E. coli needs presence of the B.
subtilis TatAd and TatCd transport components - The data presented above
indicate that the Tat system of E. coli does not mediate transport of prePhoD or of
the SPphoD-LacZ fusion protein. Absence of translocation could be due to the
necessity of additional components for the translocation of PhoD present only in
B. subtilis or due to the specificity of recognition of PhoD as a Tat-dependent
substrate. Our previous observation that only the TatCd protein but not the
second copy of TatC could mediate the Tat-dependent transport in B. subtilis was
a first indication for a specific recognition of prePhoD. To test this hypothesis, the
B. subtilis tatAd/Cd gene pair was amplified from the chromosome of B. subtilis
and inserted under the control of the IPTG-inducible promoter of pQE9
(QIAGEN). The resulting plasmid pQE9tatAd/Cd and the repressor plasmid
pREP4 were transformed into E. coli TG1(pARphoD) and TG1(pARphoD-lacZ).

In order to study the effect of TatAd/Cd proteins on localisation of PhoD,
expression of phoD as well as tatAd/Cd was induced with arabinose and IPTG in
strain TG1(pARphoD, pREP4, pQE9tatAd/Cd). Unexpectedly, no PhoD could be
detected in strain TG1(pARphoD, pREP4, pQE9tatAd/Cd) using Western blotting
(data not shown). Induction of TatAd/Cd proteins in strain TG1(pARphoD'-lacZ,
pREP4, pQE9tatAd/Cd) resulted in stable co-production of TatAd/Cd proteins and the
SPphoD-LacZ fusion protein (data not shown). SPphoD-LacZ processing was
analysed in the presence and absence of TatAd/Cd using pulse-chase labelling and
subsequent immunoprecipitation with specific antibodies against LacZ. While in
TG1(pARphoD'-lacZ) no processing of SPphoD-LacZ could be observed (Fig.
10A), in strain TG1(pARphoD'-lacZ, pREP4, pQE9tatAd/Cd) the peptide was at least
partially processed (Fig. 10B).

Since processing of the translocation product is an indication of membrane
translocation but does not necessarily prove that export of the protein has
occurred, we examined whether LacZ could be localised in the periplasmic space
in TG1(pARphoD'-lacZ, pREP4, pQE9tatAd/Cd). As shown in Table II, the relative
amount of periplasmic LacZ activity was significantly raised when compared to
TG1(pARphoD'-lacZ). Surprisingly, the relative activity of LacZ in the strain
expressing tatAd/Cd was much lower than that of TG1(pARphoD'-lacZ).
To monitor localisation of the LacZ peptide, cells of strain TG1(pARphoD'-lacZ,
pREP4, pQE9tatAd/Cd) were converted to spheroplasts and treated with
proteinase K. As shown in Fig. 10B, in cells co-expressing tatAd/Cd the fusion protein
SPphoD-LacZ was completely susceptible to protease digestion in spheroplasts.
The resistance of SecB to the proteolytic digestion confirms the reliability of the
method. Unexpectedly, both the processed form and the precursor of the fusion
protein were accessible to the protease treatment. These results clearly show
that the SPphoD-LacZ fusion protein is exported into the periplasmic space of E.
coli when the B. subtilis tatAd/Cd genes are co-expressed.

TatAd/Cd-mediated transport of SPphoD-LacZ requires a pH gradient (delta pH)
across the cytosolic membrane and is Sec-independent - To prove directly
that the membrane translocation of the system is dependent on the pH gradient
across the cytosolic membrane, Sec- and Tat-dependent protein translocation
pathways were selectively blocked. Nigericin, an ionophore inhibiting the Tat-
dependent protein translocation as a result of destroying the membrane potential
(29), efficiently blocked both processing and translocation of SPphoD-LacZ in
TG1(pARphoD'-lacZ, pREP4, pQE9tatAd/Cd) (Fig. 11A). Sodium azide (3 mM),
which severely inhibits Sec-dependent protein export by interfering with the
translocation-ATPase activity of the SecA protein (30), did not affect the
localisation and the processing of the SPphoD-LacZ fusion protein in this strain, as
shown in Fig. 11B.

TatAd/Cd-mediated transport of SPphoD-LacZ is not assisted by E. coli Tat
components - Despite the above observations, it cannot be excluded that the E.
coli Tat machinery assists TatAd/Cd-mediated transport of SPphoD-LacZ. The E.
coli tat genes are constitutively expressed in E. coli and therefore form a
functional constitutive translocase unit (Jack et al. (2001) Constitutive expression
of Escherichia coli tat genes indicates an important role for the twin-arginine
translocase during aerobic and anaerobic growth. J. Bacteriol. 183, 1801-1804).
To exclude co-operative action of B. subtilis and E. coli Tat proteins, the
tatABCDE genes were deleted in E. coli strain TG1, which was subsequently
transformed with plasmids pARphoD'-lacZ, pREP4 and pQE9tatAd/Cd. Processing
and localisation of the SPphoD-LacZ fusion protein was analysed under the same
conditions as described for the E. coli tat+ strain. Although the total amount of LacZ
found in the periplasmic fraction was reduced compared to the E. coli tat wild
type strain expressing phoD'-lacZ and tatAd/Cd, the relative amount of
periplasmic LacZ was significantly elevated compared to TG1(pARphoD'-
lacZ) (Table II). As shown in Fig. 12, in the absence of the E. coli tatABCDE genes
most of the SPphoD-LacZ hybrid protein was protease accessible, demonstrating
the extracytosolic localisation of SPphoD-LacZ. The resistance of SecB to the
proteolytic digestion demonstrated the stability of the spheroplasts (Fig. 13).
Surprisingly, no processing of the SPphoD-LacZ fusion protein could be observed
in the absence of tatABCDE. Taken together, the B. subtilis Tat components
TatAd/Cd can mediate translocation of the hybrid peptide consisting of the twin-
arginine signal peptide of PhoD and LacZ.

Although the foregoing invention has been described in some detail by way 
of illustration and example for purposes of clarity and understanding, it will be 
obvious that certain changes and modifications may be practiced within the scope 
of the appended claims. 

Table I. Predicted Twin-Arginine Signal Peptides of B. subtilis*

protein

AlbB
AmyX (TM)
AppB (TM)
LipA
OppB (TM)
PbpX
PhoD
QcrA
SpoIIIJ
TlpA (TM)
WapA
WprA
YceA (TM)
YdeJ
YdhF
YdhK
YesM (TM)
YesW
YfkN (TM)
YkpC
YkuE
YmaC
YmzC
YubF (TM)
YuiC (TM)
YvhJ
YwbN

signal peptide

MSPAQRRIIiL
MVSIRRSF
MAAYIIRR
mkfvkr
MLKYIGR
mtsptrrrtakrrrrkl...
maydsrfdewvqklkeesfqnntfdrrkfh
mggkhdisrrq
ml
mkktlttirrssiar
mkkrkrrnf
MKRRKF
memfdlefmr
MKKRR
MSAGKSYRKKMKQRRMNMKIS
MKKRVAGWYRRMKIKD
MRRSCLMIRRRKR...
mriqkrrthvenil
MLRDLGRR
mkkmsrrqfl
mrrfll
MFESEAEIiRRI
MQKYRRRNTf
MMLNMIRRg
MAERVRVRVRKKKKSKRRKIL
msdeqkkpeqihrrdil

* Putative twin-arginine signal peptides were identified in two ways. First, the presence of the consensus
sequence R-R-X-φ-φ (φ is a hydrophobic residue), immediately in front of an amino-terminal hydrophobic
region as predicted with the TopPred2 algorithm (34, 35), was determined. To this purpose, the first 60 residues
of all annotated proteins of B. subtilis in the SubtiList database (http://bioweb.pasteur.fr/Genolist/Subtilist.html)
were used. Second, within the group of twin-arginine membrane sorting signals, cleavable signal peptides were
identified with the SignalP algorithm (61, 62). Conserved residues of the twin-arginine consensus sequence
(R-R-X-φ-φ) are indicated in bold. In addition, positively charged residues that could function as a so-called Sec-
avoidance signal (54) are indicated in bold and italics. The hydrophobic H-domain is indicated in gray shading.
In signal peptides with a predicted signal peptidase I cleavage site, residues from position -3 to -1 relative to the
signal peptidase I cleavage site are underlined. Notably, some of these proteins contain one or more putative
transmembrane segments elsewhere in the protein (indicated with "TM"), or are putative lipoproteins. Residues
forming a so-called lipobox for signal peptidase II cleavage are enlarged in size.
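
The first step of the search described in the footnote to Table I, scanning the first 60 residues of a protein for the twin-arginine consensus, can be sketched as a pattern match. This is an illustration only: the set of residues treated as hydrophobic and the function name are assumptions, and the subsequent TopPred2 and SignalP predictions are not reproduced here.

```python
import re

# Residues treated as hydrophobic in this sketch (an assumption; the source
# only states that the two positions after R-R-X are hydrophobic).
HYDROPHOBIC = "AFGILMVW"

# R-R-X-hydrophobic-hydrophobic, where X is any residue.
TWIN_ARGININE = re.compile("RR.[" + HYDROPHOBIC + "]{2}")


def find_twin_arginine(protein, window=60):
    """Return (0-based start, matched motif) for the first consensus match
    within the first `window` residues, or None if there is no match."""
    match = TWIN_ARGININE.search(protein.upper()[:window])
    if match is None:
        return None
    return match.start(), match.group()


if __name__ == "__main__":
    # Amino-terminal residues of B. subtilis PhoD as listed in Table IV.
    phod = "MAYDSRFDEWVQKLKEESFQNNTFDRRKFIQI"
    print(find_twin_arginine(phod))
```

A hit from such a scan is only a candidate; in the procedure described above it must additionally lie immediately in front of a predicted hydrophobic region.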

Table II. Plasmids and Strains

Plasmids       Relevant properties                                              Reference

pUC21          cloning vector; 3.2 kb; Ap^r                                     63
pJCd1          pUC21 derivative; carrying the tatCd gene; 5.4 kb; Ap^r          This work
pJCd2          pUC21 derivative for the disruption of tatCd; 6.3 kb;            This work
               Ap^r; Km^r
pJCy1          pUC21 derivative; carrying the tatCy gene; 5.3 kb; Ap^r          This work
pJCy2          pUC21 derivative for the disruption of tatCy; 6.5 kb;            This work
               Ap^r; Sp^r
pMutin2        pBR322-based integration vector for B. subtilis; containing a    31
               multiple cloning site downstream of the Pspac promoter, and a
               promoter-less lacZ gene preceded by the RBS of the spoVG
               gene; 8.6 kb; Ap^r; Em^r
pMICd1         pMutin2 derivative; carrying the 5' part of the B. subtilis
               tatCd gene
pMICy1         pMutin2 derivative; carrying the 5' part of the B. subtilis
               tatCy gene
pDG792         contains a Km resistance cassette; 4.0 kb; Ap^r; Km^r
pDG1726        contains a Sp resistance cassette; 3.9 kb; Ap^r; Sp^r

Strains        Relevant properties

E. coli
MC1061         F-; araD139; Δ(ara-leu)7696; Δ(lac)X74; galU; galK; hsdR2;
               mcrA; mcrB1; rspL

B. subtilis
168            trpC2
ΔtatCd         trpC2; tatCd; Km^r
ΔtatCy         trpC2; tatCy; Sp^r
ItatCd         trpC2; Pspac-tatCd; tatCd-lacZ; Em^r
ItatCy         trpC2; Pspac-tatCy; tatCy-lacZ; Em^r
ΔtatCd-ΔtatCy  trpC2; tatCd; Km^r; tatCy; Sp^r
ItatCd-ΔtatCy  trpC2; Pspac-tatCd; tatC



This work 
This work 
This work 
This work 
This work 



WO 02/22667 



PCT/US01/29151 



Table III. β-galactosidase activity (U/OD600)*

strain     LPDM         MM            SSM          TY

168        0            0.1 ± 0.1     0.3 ± 0.2    0.6 ± 0.2
ItatCd     1.1 ± 0.7    0.1 ± 0.1     0.3 ± 0.2    0.5 ± 0.2
ItatCy     6.1 ± 2.5    10.0 ± 3.6    4.0 ± 2.0    132 ± 5.5

* Strains ItatCd (tatCd-lacZ), ItatCy (tatCy-lacZ) or the parental strain 168 (no lacZ gene fusion) were grown
for 10 hours in LPDM, MM, SSM or TY medium after dilution from an overnight culture. Samples for
β-galactosidase activity determinations were taken at hourly intervals, starting 4 hours after dilution from the
overnight culture. As the β-galactosidase activities showed little variation during the entire period of sampling,
average values were determined. The numbers in the table represent average values from 3 different
experiments. Note that HPDM medium was used for the overnight culture of cells grown in LPDM medium,
while overnight cultures of cells grown in MM, SSM or TY medium were prepared with the respective media.

Table IV. Twin-Arginine Signal Peptides of PhoD and PhoD-like proteins*

protein

PhoD (Bsu)
(Sco)
SP4

signal peptide

MAYDSRFDEWVQKLKEESFQNNTFDRRKFIQI
MTPANHQAPTSAPSPAPSQSSHAPELRAAARSLGRRRFL
MAPTGRPSALAEHAFSPHDAVLGAAARHLGRRRFL
MTPAARPSQHAPELRAAARHLGRRRFL

* Homologues of B. subtilis PhoD were identified by amino acid sequence similarity searches in GenBank using
the BLAST algorithm. SP1 (Sco), gene SCC75A.32c of Streptomyces coelicolor (CAB61732); SP2 (Sco), gene
SCF43A.18 of S. coelicolor (CAB48905); SP3 (Sco), gene SC4G6.37 of S. coelicolor (CAB51460), and SP4,
phoD gene of Streptomyces tendae (CAB62565). GenBank accession numbers are indicated in parentheses.
Conserved residues of the twin-arginine consensus sequence are indicated in bold. The hydrophobic H-region is
indicated in gray shading. Signal peptidase I recognition sequences predicted with the SignalP algorithm (61,
62) ar

TABLE 5 - Plasmids and Strains

Plasmids          Relevant properties                                       Reference

pAR3              pACYC184 derived plasmid carrying the araB promoter       25
                  operator and the araC repressor gene from Salmonella
                  typhimurium; Cm^r
pAR3phoD          pAR3 derivative; carrying the phoD gene; Cm^r             This work
pAR3phoD-lacZ     pAR3 derivative; carrying a fusion gene consisting of     This work
                  the signal sequence region of phoD and lacZ; Cm^r
pQE9              pBR322-based vector for IPTG-inducible synthesis of       Qiagen
                  His6-tagged proteins; Ap^r
pREP4             plasmid; containing lacI repressor gene; Km^r             Qiagen
pORI24            plasmid; replicates only in E. coli rep+ strains; Tc^r    37
pMUTIN2           pBR322-based integration vector for B. subtilis;          38
                  containing a multiple cloning site downstream of the
                  Pspac promoter, and a promoter-less lacZ gene
                  preceded by the RBS of the spoVG gene; Ap^r; Em^r
pMUTIN2bla-phoD   pMUTIN2 derivative; carrying a fusion gene consisting     This work
                  of signal sequence region of bla and phoD
pQE9tatAd/Cd      pQE9 derivative; carrying the B. subtilis tatAd/Cd genes  This work
pFAT44            pMAK705 (Hamilton et al, 1989) derivative plasmid         7
                  containing in frame deletion of E. coli tatE
pFAT126           pMAK705 derivative plasmid containing in frame            39
                  deletion of E. coli tatABCD

Strains           Relevant properties                                       Reference

E. coli
TG1               F- araD139 Δ(ara-leu)7696 Δ(lac)X74 galU galK             40
                  hsdR2 mcrA mcrB1 rspL
TG1 ΔtatABCE      TG1 ΔtatABCE                                              This work

B. subtilis
168               trpC2                                                     13

a Cm^r, chloramphenicol resistance marker; Ap^r, ampicillin resistance marker; Km^r, kanamycin
resistance marker; Tc^r, tetracycline resistance marker; Em^r, erythromycin resistance marker

TABLE - Localisation of β-galactosidase activity in E. coli TG1(pAR3phoD-lacZ) strains.
To investigate the translocation of the hybrid protein consisting of SPphoD and LacZ, cells of
E. coli strains were grown in TY medium to exponential growth. Samples for β-galactosidase
activity determinations were taken from supernatants of lysozyme treated cells representing
periplasmic activity and spheroplasts representing cell bound activity. Experiments were
carried out with duplicated cultures. +/-, standard deviation.

                                        LacZ activity (units/OD600)

strain                                  cell bound     periplasmic   total activity   % export

TG1(pAR3phoD-lacZ)                      1108 +/- 201   67 +/- 5      1175             6.4 +/- 3.4
TG1(pAR3phoD-lacZ,
  pREP4, pQE9tatAd/Cd)                  226 +/- 11     94 +/- 2      320              29.4 +/- 0.4
TG1 ΔtatABCE(pAR3phoD-lacZ,
  pREP4, pQE9tatAd/Cd)                  278 +/- 8      39 +/- 5      317              12.5 +/- 0.9
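
The "% export" column of the table above relates periplasmic activity to total activity. A minimal sketch of that calculation follows, assuming that export is defined as periplasmic activity divided by the sum of cell-bound and periplasmic activity; the function name is illustrative. Because the tabulated percentages were presumably averaged over the duplicate cultures, recomputing them from the mean activities may differ slightly from the printed values.

```python
def percent_export(cell_bound, periplasmic):
    """Periplasmic activity as a percentage of total (cell-bound + periplasmic)."""
    total = cell_bound + periplasmic
    if total <= 0:
        raise ValueError("total activity must be positive")
    return 100.0 * periplasmic / total


if __name__ == "__main__":
    # Mean activities (units/OD600) taken from the table above.
    rows = {
        "TG1(pAR3phoD-lacZ)": (1108, 67),
        "TG1(pAR3phoD-lacZ, pREP4, pQE9tatAd/Cd)": (226, 94),
        "TG1 delta-tatABCE(pAR3phoD-lacZ, pREP4, pQE9tatAd/Cd)": (278, 39),
    }
    for strain, (cell_bound, periplasmic) in rows.items():
        print(strain, round(percent_export(cell_bound, periplasmic), 1))
```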

What is claimed:

1. A chimeric polypeptide, comprising (i) a secretion signal peptide of PhoD or
LipA derived from Bacillus subtilis and (ii) a heterologous polypeptide.

2. The chimeric polypeptide of claim 1, wherein said heterologous polypeptide
is not naturally associated with any secretion signal peptide.

3. A nucleic acid molecule comprising a first nucleotide sequence encoding a 
PhoD or LipA signal sequence operatively linked to a second nucleotide 
sequence encoding a heterologous polypeptide. 

4. A recombinant expression vector comprising a first DNA sequence 
encoding a PhoD or LipA signal sequence operatively linked to a second 
DNA sequence encoding a heterologous polypeptide. 

5. A host cell containing a recombinant expression vector comprising a first 
DNA sequence encoding a PhoD or LipA signal sequence operatively 
linked to a second DNA sequence encoding a heterologous polypeptide. 

6. The host cell of claim 5, wherein said polypeptide is not naturally 
associated with a secretion signal peptide. 

7. A method for producing a polypeptide, comprising culturing a host cell 
containing a recombinant expression vector comprising a first DNA 
sequence encoding a PhoD or LipA signal sequence operatively linked to a 
second DNA sequence encoding a heterologous polypeptide such that the 
heterologous polypeptide is produced by the host cell. 

8. The method of claim 7, wherein the polypeptide is secreted by the host cell 
into a culture medium. 

9. The method of claim 8, further comprising recovering the polypeptide from 
the culture medium. 

10. A method for producing a heterologous polypeptide in bacteria comprising: 
(a) culturing bacterial cells that (i) lack a functional TatCy gene and
(ii) contain a recombinant expression vector comprising a first 
DNA sequence encoding a PhoD or LipA signal sequence 
operatively linked to a second DNA sequence encoding a 
heterologous polypeptide such that the heterologous polypeptide 
is produced by the cells; and 

(b) recovering the heterologous polypeptide from the periplasm or 
the culture medium. 

11. A process for producing a heterologous polypeptide in bacteria comprising:

(a) culturing bacterial cells that (i) overexpress one or more B. 
subtilis Tat system genes encoding membrane-bound 
components thereof and (ii) contain a recombinant expression 
vector comprising a first DNA sequence encoding a PhoD or 
LipA signal sequence operatively linked to a second DNA 
sequence encoding a heterologous polypeptide such that the 
heterologous polypeptide is produced by the cells; and 

(b) recovering the heterologous polypeptide from the periplasm or 
the culture medium. 